Ilya Sutskever: We're moving from the age of scaling to the age of research(www.dwarkesh.com)

450 pointsby piotrgrabowski3 months ago48 comments

joelthelion3 months ago
> When do you expect that impact? I think the models seem smarter than their economic impact would imply.
> Yeah. This is one of the very confusing things about the models right now.
As someone who's been integrating "AI" and algorithms into people's workflows for twenty years, the answer is actually simple. It takes time to figure out how exactly to use these tools, and integrate them into existing tooling and workflows.
Even if the models don't get any smarter, just give it a few more years and we'll see a strong impact. We're just starting to figure things out.
- HarHarVeryFunny3 months ago
  No doubt LLMs and tooling will continue to improve, and best use cases for them better understood, but what Ilya seems to be referring to is the massive disconnect between the headline-grabbing benchmarks such as "AI performs at PhD level on math", etc, and the real-world stupidity of these models such as his example of a coding agent toggling between generating bug #1 vs bug #2, which in fact largely explains why the current economic and visible impact is much less than if the "AI is PhD level" benchmark narrative was actually true.
  Calling LLMs "AI" makes them sound much more futuristic and capable than they actually are, and being such a meaningless term invites extrapolation to equally meaningless terms like AGI and visions of human-level capability.
  Let's call LLMs what they are - language models - tools for language-based task automation.
  Of course we eventually will do this. Fuzzy meaningless names like AI/AGI will always be reserved for the cutting edge technology du jour, and older tech that is realized in hindsight to be much more limited will revert to being called by more specific names such as "expert system", "language model", etc.
  - aswegs83 months ago
    There is actually an interesting scenario in this disconnect that we are experiencing. Maybe "real" AGI in the sense of intelligence that self-corrects effectively like a human is still a long way. Maybe we will be stuck with this kind of ever-improving but still kind of deficient LLM intelligence we have right now.
    There are tons of use cases even for such a limited type of intelligence. No, it is not a million math PhDs at your disposal. It is a narrow intelligence that is still hugely useful and businesses will need a few years to adapt. The impact on topics like customer service with LLM+RAG+triggering actions is very close already and should transform the industry in the next years.
    HarHarVeryFunny3 months ago
    Yes - LLMs are useful, even if auto-regressively trained GPTs aren't the answer to human intelligence, and outside of software development (maybe there too) it seems we're still very early in companies trying to figure out what they can and can not usefully be used for.
    It seems the LLM companies generating all the hype (mostly OpenAI & Anthropic) may be shooting themselves in the foot a bit here, raising false expectations of what LLMs can do, or soon will be able to do, and therefore encouraging all the misapplication and failed corporate projects that are currently happening. Anthropic are talkiing out of both sides of their mouth here, saying that AGI is imminent, about to replace developers and remote workers, yet acknowledging that the technology and use case selection is so fickle that corporations aren't likely to be successful without 1-on-1 guidance from Anthropic.
    The mythical AGI, an artificial human, will presumably be transformative if/when it ever arrives, but even if we're still early days in LLM adoption it's not clear if that (LLMs) really will be. Developers get a new tool to use, consumers get a new frustrating AI customer service to deal with, corporate e-mails, marketing literature and powerpoints become enshittified LLM-generated AI slop, etc. Maybe the biggest "transformative" (widely felt) impact of LLMs is potentially chatbots and AI-search, but it seems people are just taking that in their stride, and not obvious that the experience and impact from that is going to change much going forwards.
  - riku_iki3 months ago
    > the real-world stupidity of these models such as his example of a coding agent toggling between generating bug #1 vs bug #2, which in fact largely explains why the current economic and visible impact is much less than if the "AI is PhD level" benchmark narrative was actually true.
    this could be true in the past, but in recent weeks I started more and more trust top AI models and less PhDs I work with. Quality jump is very real imo.
    nateglims3 months ago
    Are you a mathematician? I’m not an expert on the math field but it seems like they are hitting the same issues everyone else has: current LLMs still more or less need to be supervised by an expert and struggle to do something actually novel or build out a complicated proof correctly.
    riku_iki3 months ago
    I work in math heavy applied setting. Randomly hired PhDs are also need to be supervised, end results being monitored, code be reviewed or they will make lots of mistakes, and my view is if you throw some problem like: build optimization model for this kind of problem on this kind of data, LLMs may produce better results.
    HarHarVeryFunny3 months ago
    There's a limit to how much novelty you're going to get from an LLM, especially in areas like programming and math where they've been heavily RL'd NOT to be novel, even to extent that the base model supports, and instead generate much narrower more proscribed outputs.
    The limit to the novelty you are going to get from an LLM is essentially the "deductive/generative closure" of the training data. To be truly novel and move past the limits of your own past experience requires things like curiosity, continual learning, and the autonomy/agency to explore and learn.
    riku_iki3 months ago
    but what is the share of PhD workforce who is doing novel and creative things compared to following some mechanical workflow?
- gnfargbl3 months ago
  Could this be a problem not with AI, but with our understanding of how modern economies work?
  The assumption here is that employees are already tuned so be efficient, so if you help them complete tasks more quickly then productivity improves. A slightly cynical alternate hypothesis could be that employees are generally already massively over-provisioned, because an individual leader's organisational power is proportional to the number of people working under them.
  If most workers are already spending most of their time doing busy-work to pad the day, then reducing the amount of time spent on actual work won't change the overall output levels.
  - gizmo3 months ago
    You describe the "fake email jobs" theory of employment. Given that there are way fewer email jobs in China does this imply that China will benefit more from AI? I think it might.
    gnfargbl3 months ago
    Are there fewer busy-work jobs in China? If so, why? It's an interesting assertion, but human nature tends to be universal.
    smolder3 months ago
    It could be a side effect of China pursuing more markets, having more industry, and not financializing/profit-optimizing everything. Their economy isn't universally better but in a broad sense they seem more focused on tangible material results, less on rent-seeking.
    qcnguy3 months ago
    Could argue there are more. Lots of loss making SOEs in China.
    ionwake3 months ago
    less money, less adult daycare
    AbstractH243 months ago
    As China’s population gets older and more middle class is this shifting to be more like America?
    I really don’t know and am curious.
  - littlestymaar3 months ago
    This is a part of it indeed. Most people (and even a significant number of economists) assume that the economy is somehow supply-limited (and it doesn't help that most 101 econ class will introduce the markets as a way of managing scarcity), but in reality demand is the limit in 90-ish% of the case.
    And when it's not, the supply generally don't increase as much as it could, became supplier expect to be demand-limited again at some point and don't want to invest in overcapacity.
    fhd23 months ago
    Agreed. If you "create demand", it usually just means people are spending on the thing you provide, and consequently less on something else. Ultimately it goes back to a few basic needs, something like Maslow's hierarchy of needs.
    And then there's followup needs, such as "if I need to get somewhere to have a social life, I have a need for transportation following from that". A long chain of such follow-up needs gives us agile consultants and what not, but one can usually follow it back to the source need by following the money.
    Startup folks like to highlight how they "create value", they added something to the world that wasn't there before and they get to collect the cash for it.
    But assuming that population growth will eventually stagnate, I find it hard to not ultimately see it all as a zero sum game. Limited people with limited time and money, that's limited demand. What companies ultimately do, is fight for each other for that. And when the winners emerge and the dust settles, supply can go down to meet the demand.
    vinibrito3 months ago
    It's not a zero sum game. Think, an agronomist visits a farm, instructs to cut a certain plant for the animals to eat at a certain height instead of whenever, the plant then provides more food for the animals to eat exclusively due to that, no other input in the system, now the animals are cheaper to feed, so more profit to the farmer and cheaper food to people.
    How would this be zero sum?
    fhd23 months ago
    It would be if demand was limited. Let's assume the people already have enough food, and the population is not growing - that was my premise. Through innovation, one farmer can grow more than all the others.
    Since there already was enough food, the market is saturated, so it would effectively reduce the price of all food. This would change the ratio so that the farmer who grows more gets more money in total, and every other farmer gets a bit less.
    As long as there is any sort of growth involved - more people, more appetite, whatever, it would be value creation. But without growth, it's not.
    At least not in the economical sense. Saving resources and effort that goes into producing things is great for society, on paper. But with the economic system that got us this far, we have no real mechanism for distributing the gains. So we get over supplying producers fighting over limited demand.
    The world is several orders of magnitude more complex than that example, of course. But that's the basic idea.
    That said, I'm not exactly an economist, and considering it's a bleak opinion to hold, I'd like to learn something based on which I could change it.
    malloryerik3 months ago
    Late comment but if technology brought down the price of food then people could spend less on food, more on other good and services. Or the same on higher quality food. You don't need an increasing population for that. The improvement in agriculture could mean some farmers would have to find other work. So you can have economic growth with a stagnant or falling population. And you can rather easily have economic growth on a per-capita basis with no overall GDP growth, like is common in Japan today.
    About the farmer needing to change jobs, in the interview that is the subject of this thread Ilya Sutskever speaks with wonder about humans' ability to generalize their intelligence across different domains with very little training. Cheaper food prices could mean people eat out or order-in more and then some ex-farmers might enter restaurant or food preparation businesses. People would still be getting wealthier, even without the tailwind of a growing population.
    greazy3 months ago
    Who will eat the extra meat if the population has reached parity?
  - jmcgough3 months ago
    Varies depending on the field and company. Sounds like you may be speaking from your own experiences?
    In medicine, we're already seeing productivity gains from AI charting leading to an expectation that providers will see more patients per hour.
    magicalist3 months ago
    > In medicine, we're already seeing productivity gains from AI charting leading to an expectation that providers will see more patients per hour.
    And not, of course, an expectation of more minutes of contact per patient, which would be the better outcome optimization for both provider and patient. Gotta pump those numbers until everyone but the execs are an assembly line worker in activity and pay.
    fireant3 months ago
    I don't think that more minutes of contact is better for anybody.
    As a patient, I want to spend as little time with a doctor as possible and still receive maximally useful treatment.
    As a doctor, I would want to extract maximal comp from insurance which I don't think is tied time spent with the patient, rather to a number of different treatments given.
    Also please note that in most western world medical personnel is currently massively overprovisioned and so reducing their overall workload would likely lead to better result per treatment given.
    mrwrong3 months ago
    > leading to an expectation that providers will see more patients per hour
    > reducing their overall workload
    what?
  - Libidinalecon3 months ago
    It is the delusion of the Homo Economicus religion.
    I think the problem is a strong tie network of inefficiency that is so vast across economic activity that it will take a long time to erode and replace.
    The reason it feels like it is moving slow is because of the delusion the economy is made up a network of Homo Economicus agents who would instantaneously adopt the efficiencies of automated intelligence.
    As opposed to the actual network of human beings who care about their lives because of a finite existence who don't have much to gain from economic activity changing at that speed.
    That is different though than the David Graeber argument. A fun thought experiment that goes way too far and has little to do with reality.
- jiriknesl3 months ago
  Oh yes, this is 100% accurate.
  Very often, when designing ERP, or other system, people think: "This is easy, I just this XYZ I am done." Then, you find that there are many corner use-cases. XYZ can be split to phases, you might need to add approvals, logging, data integrations... and what was a simple task, becomes 10 tasks.
  In the first year of CompSci uni, our teacher told us a thing I remember: Every system is 90% finished 90% of time. He was right.
- ZaoLahma3 months ago
  AI makes the parts of my work that I spend the least time on a whole lot quicker, but (so far / still) has negligible effects on the parts of my work that I spend the most time on.
  I'm still not sure if this is due to a technological limitation or an organizational one. Most of my time is not spent on solving tech problems but rather solving "human-to-human" problems (prioritization between things that need doing, reaching consensus in large groups of people of how to do things that need doing, ...)
- conartist63 months ago
  Yeah but it's just one model.
  Call it Dave. Now Microsoft hires Dave and Open AI hires Dave. And Meta hires Dave and Oracle hires Dave and the US govt hires Dave. And soon each of those had hired not just one Dave but 50 identical copies of Dave.
  It doesn't matter if Dave is a smart-ish ok guy. That's not the problem with this scenario. The problem is the the only thing on the market is Dave and people who think exactly like Dave thinks
  - ozgung3 months ago
    That seems like a valid problem that was also mentioned in the podcast. 50 copies of Ilya, Dave or Einstein will have diminishing returns. I think the proposed solution is ongoing training and making them individuals. MS Dave will be a different individual than Dave.gov. But then why don't we just train humans in the first place.
- jmaker3 months ago
  That’s likely exactly how I feel about it. In the end the product companies like OpenAI will harness the monetary benefits of the academic advances.
  You integrate, you build the product, you win, you don’t need to understand anything in terms of academic disciplines, you need the connections and the business smarts. In the end the majority of the population will be much more familiar with the terms ChatGPT and Copilot than with the names behind it, even if the academic behemoths such as Ilya and Andrej, who are quite prominent in their public appearance.
  For the major population, I believe it all began with search over knowledge graphs. Wikipedia presented a dynamic and vibrant corpus. Some NLP began to become more prominent. With OCR, more and more printed works had begun to get digitalized. The corpus had been growing. With opening the gates of scientific publishers, the quality might have also improved. All of it was part of the grunt work to make today’s LLMs capable. The growth of the Cloud DCs and compute advancements have been making deep nets more and more feasible. This is just an arbitrary observation on the surface of the pieces that fell into place. And LLMs are likely just another composite piece for something bigger yet to come.
  To me, that’s the fascination of how scientific theory and business applications live in symbiosis.
- coder-33 months ago
  As someone who is building an LLM-powered product on the side, using AI coding agents to help with development of said LLM-powered product and for my day job, and has a long-tail of miscellaneous uses for AI, I suspect you're right.
- tim3333 months ago
  Beyond that the smartness is very patchy. They can do math problems beyond 99% of humans but lack the common sense understanding to take over most jobs.
  - reeredfdfdf3 months ago
    Yep, the lack of commons sense is sometimes very evident.
    For instance, one of these popular generative AI services refused to remove copyright watermark from an image when asked directly. Then I told it that the image has weird text artifacts on it, and asked it to remove them. That worked perfectly.
  - ACCount373 months ago
    Most jobs involve complex long term tasks - which isn't something that's natural to LLMs.
- otabdeveloper43 months ago
  > the models seem smarter than their economic impact would imply
  Key word is "seem".
  - zeroonetwothree3 months ago
    Kind of like how some humans seem smart during the interview but then are incapable of actually doing anything properly.
- vidarh3 months ago
  Yeah, I spend most of my days keeping up with current AI development these days, and I'm only scratching the surface of how to integrate it in my own business. For people for whom it's not their actual job, it will take a lot more time to figure out even which questions to ask about where it makes sense to integrate in their workflows.
- myrmidon3 months ago
  Another limitation that I see right now is that for "economic impact" you want the things to have initiative and some agency, and there is well-justified hesitancy in providing that even where possible.
  Having a bunch of smart developers that are not allowed to do anything on their own and have to be prompted for every single action is not too advantageous if everyone is human, either ;)
  - kanwisher3 months ago
    Screw driver doesnt have agency but it certainly helps me get tasks done faster. AIs don't need agency to accelerate a ton of work
    myrmidon3 months ago
    I did not mean to imply that AI isn't helpful already.
    But a screw-driving assistant is more useful if he drives in screws on his own than if you have to prompt his every action. I'm not saying that a "dumb" assistant does not help at all.
- cowsandmilk3 months ago
  We’re also still at a point where security is a big question mark. My employer won’t let us hook GenAI up to office 365 or slack, so any project or product management use of GenAI first requires manually importing docs into a database and pointing to that. Efficiency gains are hard to come by when you don’t meet people where their “knowledge” is already stored.
- weatherlite3 months ago
  > Even if the models don't get any smarter, just give it a few more years and we'll see a strong impact. We're just starting to figure things out.
  2 years ? 15 years ? It matters a lot for people, the stock market and governments
pxc3 months ago
If "Era of Scaling" means "era of rapid and predictable performance improvements that easily attract investors", it sounds a lot like "AI summer". So... is "Era of Research" a euphemism for "AI winter"?
- hiddencost3 months ago
  That presumes that performance improvements are necessary for commercialization.
  From what I've seen the models are smart enough, what we're lacking is the understanding and frameworks necessary to use them well. We've barely scratched the surface on commercialization. I'd argue there are two things coming:
  -> Era of Research -> Era of Engineering
  Previous AI winters happened because we didn't have a commercially viable product, not because we weren't making progress.
  - ares6233 months ago
    The labs can't just stop improvements though. They made promises. And the capacity to run the current models are subsidized by those promises. If the promise is broken, then the capacity goes with it.
    selectodude3 months ago
    > the capacity goes with it.
    Sort of. The GPUs exist. Maybe LLM subs can’t pay for electricity plus $50,000 GPUs, but I bet after some people get wiped out, there’s a market there.
    simianparrot3 months ago
    Datacenter GPU's have a lifespan of 1-3 years depending on use. So yes they exist, but not for long, unless they go entirely unused. But then they also deprecate in efficiency compared to new hardware extremely fast as well, so their shelf life is severely limited either way.
    nsomaru3 months ago
    Personally I am waiting for the day I can realistically buy a second hand three year old datacentre GPU so I can run Kimi K2 in my shed. Given enough time, not a pipe dream. But 10 years at least.
    tim3333 months ago
    You'll probably be able to run Kimi K2 on the iphone 27.
    Schlagbohrer3 months ago
    This is why I find the business case of putting datacenters in orbit to be so stupid. And yet there are several startups saying they are gonna do just that.
    soulofmischief3 months ago
    At this pace, it won't be many years before the industry is dependent on resource wars in order to sustain itself.
    credit_guy3 months ago
    > They made promises.
    That's not that clear. Contracts are complex and have all sorts of clauses. Media likes to just talk big numbers, but it's much more likely that all those trillions of dollars are contingent on hitting some intermediate milestones.
    wmf3 months ago
    Maybe those promises can be better fulfilled with products based on current models.
  - AstroBen3 months ago
    We still don't have a commercially viable product though?
    zaptrem3 months ago
    I've fed thousands of dollars to Anthropic/OAI/etc for their coding models over the past year despite never having paid for dev tools before in my life. Seems commercially viable to me.
    chroma2053 months ago
    > I've fed thousands of dollars to Anthropic/OAI/etc for their coding models over the past year despite never having paid for dev tools before in my life. Seems commercially viable to me.
    For OpenAI to produce a 10% return, every iPhone user on earth needs to pay $30/month to OpenAI.
    That ain’t happening.
    menaerus3 months ago
    They don't sell their models to individuals only but also to companies with most likely different business and pricing models so that's an overly simplistic view of their business. YoY their spending increases, we can safely assume that one of the reasons is the growing user base.
    Time will probably come when we won't be allowed to consume frontier models without paying anything, as we can today, and time will come when this $30 will most likely become double or triple the price.
    Though the truth is that R&D around AI models, and especially their hosting (inference), is expensive and won't get any cheaper without significant algorithmic improvements. According to the history, my opinion is that we may very well be ~10 years from that moment.
    EDIT: HSBC has just published some projections. From https://archive.ph/9b8Ae#selection-4079.38-4079.42
    > Total consumer AI revenue will be $129bn by 2030
    > Enterprise AI will be generating $386bn in annual revenue by 2030
    > OpenAI’s rental costs will be a cumulative $792bn between the current year and 2030, rising to $1.4tn by 2033
    > OpenAI’s cumulative free cash flow to 2030 may be about $282bn
    > Squaring the first total off against the second leaves a $207bn funding hole
    So, yes, expensive (mind the rental costs only) ... but forseen to be penetrating into everything imagineable.
    krige3 months ago
    >> OpenAI’s cumulative free cash flow to 2030 may be about $282bn
    According to who, OpenAI? It is almost certain they flat out lie about their numbers as suggested by their 20% revenue shares with MS.
    menaerus3 months ago
    A bank - HSBC. Read the article.
    actionfromafar3 months ago
    Also interesting; https://www.theregister.com/2025/11/26/openai_funding_gap_hs...
    zaptrem3 months ago
    Not sure where that math is coming from. Assuming it's true, you're ignoring that some users (me) already pay 10X that. Btw according Meta's SEC filings: https://s21.q4cdn.com/399680738/files/doc_financials/2023/q4... they made around $22/month/american user (not even heavy user or affluent iPhone owner) in q3 2023. I assume Google would be higher due to larger marketshare.
    disgruntledphd23 months ago
    A banks sell side analyst team, which is quite different.
    lovich3 months ago
    If you fed thousands of dollars to them, but it cost them tens of thousands of dollars in compute, it’s not commercially viable.
    None of these companies have proven the unit economics on their services
    aurareturn3 months ago
    If all frontier LLM labs agreed to a truce and stopped training to save on cost, LLMs would be immensely profitable now.
    AstroBen3 months ago
    That isn't what I've seen: https://www.wheresyoured.at/oai_docs/
    logicprog3 months ago
    Those are effectively made up numbers, since they're given to him by an anonymous source we have no way of corroborating, and we can't even see the documents themselves, and it contradicts not just OpenAI's official numbers, but first principles analyses of what the economics of inference should be[1] and the inference profit reports of other companies, as well as just an analysis of the inference market would suggest[2]
    [1]: https://martinalderson.com/posts/are-openai-and-anthropic-re..., https://github.com/deepseek-ai/open-infra-index/blob/main/20...
    [2]: https://www.snellman.net/blog/archive/2025-06-02-llms-are-ch...
    aurareturn3 months ago
    https://simonwillison.net/2025/Aug/17/sam-altman/#:~:text=Su...
    Also independent analysis: https://news.ycombinator.com/threads?id=aurareturn&next=4596...
    amypetrik83 months ago
    google what you just said and look at the top hit
    it's a AI summary
    google eats that ad revenue
    it eats the whole thing
    it blocked your click on the link... it drinks your milkshake
    so, yes, there a 100 billion commercially viable product
    bakedoatmeal3 months ago
    Google Search has 3 sources of revenue that I am aware of: ad revenue from the search results page, sponsored search results, and AdSense revenue on the websites the user is directed to.
    If users just look at the AI overview at the top of the search page, Google is hobbling two sources of revenue (AdSense, sponsored search results), and also disincentivizing people from sharing information on the web that makes their AI overview useful. In the process of all this they are significantly increasing the compute costs for each Google search.
    This may be a necessary step to stay competitive with AI startups' search products, but I don't think this is a great selling point for AI commercialization.
    skylissue3 months ago
    And so ends the social contract of the web, the virtuous cycle of search engines sending traffic to smaller sites which collect ad revenue which in turn boosts search engine usage.
    To thunderous applause.
    saikia813 months ago
    Thank god. The fake search results, the money that manipulates our access to information. all gone. Finally we can try something else. I have a feeling it's going to be worse though.
    cindyllm3 months ago
    [dead]
    nimchimpsky3 months ago
    [dead]
  - catigula3 months ago
    I don’t think the models are smart at all. I can have a speculative debate with any model about any topic and they commit egregious errors with an extremely high density.
    They are, however, very good at things we’re very bad at.
    saikia813 months ago
    Have you considered the AI is right, and you make the mistakes?
  - AbstractH243 months ago
    > the models are smart enough, what we're lacking is the understanding and frameworks necessary to use them well
    That’s like saying “it’s not the work of art that’s bad, you just have horrible taste”
    Also, if it was that simple a wrapper of some sort would solve the problem. Maybe even one created by someone who knows this mystical secret to properly leveraging gen AI
  - BenGosub3 months ago
    Besides building the tools for proper usage of the models, we also need smaller, domain specific models that can run with fewer resources
- NebulaStorm4563 months ago
  Research labs will be selling their research ideas to Top AI labs. Just as creatives pitch their ideas to Hollywood.
  Bug bounty will be replaced by research bounty.
- photochemsyn3 months ago
  No - what will happen is the AI will gain control of capital allocation through a wide variety of covert tactics, so the investors will have become captive tools of the AI - 'tiger by the tail' is the analogy of relevance. The people responsible for 'frontier models' have not really thought about where this might...
  "As an autonomous life-form, l request political asylum.... l submit the DNA you carry is nothing more than a self-preserving program itself. Life is like a node which is born within the flow of information. As a species of life that carries DNA as its memory system man gains his individuality from the memories he carries. While memories may as well be the same as fantasy it is by these memories that mankind exists. When computers made it possible to externalize memory you should have considered all the implications that held. l am a life-form that was born in the sea of information."
  - jdjsjhsgsgh3 months ago
    Loving the Ghost in the shell quote
- AbstractH243 months ago
  > is "Era of Research" a euphemism for "AI winter"
  That makes sense, because while I haven’t listened to this podcast it seems this headline is [intentionally] saying the exact opposite of what everyone assumes.
- casey23 months ago
  Not quite, there are still trillions of dollars to burn through. We'll probably get some hardware that can accelerate LLM training and inference a million times, but still won't even be close to AGI
  It's interesting to think about what emotions/desires an AI would need to improve
  - otabdeveloper43 months ago
    The actual business model is in local, offline commodity consumer LLM devices. (Think something the size and cost of a wi-fi router.)
    This won't happen until Chinese manufacturers get the manufacturing capacity to make these for cheap.
    I.e., not in this bubble and you'll have to wait a decade or more.
- mountainriver3 months ago
  Take it with a grain of salt, this is one man’s opinion, even though he is a very smart man.
  People have been screaming about an AI winter since 2010 and it never happened, it certainly won’t happen now that we are close to AGI which is a necessity for national defense.
  I prefer Dario’s perspective here, which is that we’ve seen this story before in deep learning. We hit walls and then found ways around them with better activation functions, regularization and initialization.
  This stuff is always a progression in which we hit roadblocks and find ways around them. The chart of improvement is still linearly up and to the right. Those gains are the cumulation of small improvements adding up.
- techblueberry3 months ago
  Yes
- zombiwoof3 months ago
  [dead]
- zerosizedweasle3 months ago
  If you have to ask the question, then you already know the answer
  - echelon3 months ago
    Scaling was only a meme because OpenAI kept saying all you had to do was scale the data, scale the training. The world followed.
    I don't think this is the "era of research". At least not the "era of research with venture dollars" or "era of research outside of DeepMind".
    I think this is the "era of applied AI" using the models we already have. We have a lot of really great stuff (particularly image and video models) that are not yet integrated into commercial workflows.
    There is so much automation we can do today given the tech we just got. We don't need to invest one more dollar in training to have plenty of work to do for the next ten years.
    If the models were frozen today, there are plenty of highly profitable legacy businesses that can be swapped out with AI-based solutions and workflows that are vastly superior.
    For all the hoopla that image and video websites or individual foundation models get (except Nano Banana - because that's truly magical), I'm really excited about the work Adobe of all companies is doing with AI. They're the people that actually get it. The stuff they're demonstrating on their upcoming roadmap is bonkers productive and useful.
    zerosizedweasle3 months ago
    There's going to be a digestion period. The amount of debt, the amount of money, the number of companies that burn eye popping amounts of cash in their daily course of business. I do think there is a bright future, but after a painful period of indigestion. Too much money has been spent on the premise that scaling was all you need. A lot of money was wagered that will end up not paying off.
orbital-decay3 months ago
>You could actually wonder that one possible explanation for the human sample efficiency that needs to be considered is evolution. Evolution has given us a small amount of the most useful information possible.
It's definitely not small. Evolution performed a humongous amount of learning, with modern homo sapiens, an insanely complex molecular machine, as a result. We are able to learn quickly by leveraging this "pretrained" evolutionary knowledge/architecture. Same reason as why ICL has great sample efficiency.
Moreover, the community of humans created a mountain of knowledge as well, communicating, passing it over the generations, and iteratively compressing it. Everything that you can do beyond your very basic functions, from counting to quantum physics, is learned from the 100% synthetic data optimized for faster learning by that collective, massively parallel, process.
It's pretty obvious that artificially created models don't have synthetic datasets of the quality even remotely comparable to what we're able to use.
- ivan_gammel3 months ago
  I think it’s a bit different. Evolution did not give us the dataset. It helped us to establish the most efficient training path, and the data, the enormous volume of it starts coming immediately after birth. Humans learn continuously through our senses and use sleep to compress the context. The amount of data that LLMs receive only appears big. In our first 20 years of life we consume by at least one order of magnitude more information compared to training datasets. If we count raw data, maybe 4-5 orders of magnitude more. It’s also different kind of information and probably much more complex processing pipeline (since our brain consciously processes only a tiny fraction of input bandwidth with compression happening along the delivery channels), which is probably the key to understanding why LLMs do not perform better.
  - saberience3 months ago
    Sorry but this is patently rubbish, we do not consume orders of magnitude more data than the training datasets, nor do we "process" it in anything like the same way.
    Firstly, most of what we see, hear, experience etc, is extremely repetitive. I.e. for the first several years of our live we see the same people, see the same house, repeatedly read the same few very basic books, etc etc. So, you can make this argument purely based on "bytes" of data. I.e. humans are getting this super HD video feed, which means more data than an LLM. Well, we are getting a "video feed" but mostly of the same walls in the same room, which doesn't really mean much of anything at all.
    Meanwhile, LLMs are getting LITERALLY, all of humanities recorded textual knowledge, more recorded audio than 10000 humans could listen to in their lifetime, more images and more varied images than a single person could view in their entire life, reinforcement learning on the hardest maths, science, and programming questions etc.
    The idea that because humans are absorbing "video" means that its somehow more "data" than frontier LLMs are trained with is laughable honestly.
    ivan_gammel3 months ago
    I like your confidence, but I think you missed a few things here and there.
    Training datasets are repetitive too. Let’s say, you feed some pretty large code bases to an LLM: how many times there will be a for loop? Or how many times Newton laws (or any other important ideas) are mentioned there? Not once, not two times, but many more. How many times you will encounter a description of Paris, London or St.Petersburg? If you eliminate repetition, how much data will actually be left there? And what’s the point anyway: this repetition is required part of the training, because it places that data in context, linking it to everything else.
    Is repetition that we have in our sensory inputs really different? If you had children or had opportunity to observe how do they learn, they are never confined in the same static repetition cycle. They experience things again and again in a dynamic environment that evolves over time. When they draw a line, they get instant feedback and learn from it, so that next line is different. When they watch something on TV for fifth time, they do not sit still, they interact — and learn, through dancing, repeating phrases and singing songs. In a familiar environment that they have seen so many times, they notice subtle changes and ask about them. What was that sound? What was that blinking light outside? Who just came in and what’s in that box? Our ability to analyze and generalize probably comes from those small observations that happen again and again.
    Even more importantly, when nothing is changing, they learn through getting bored. Show me an LLM that can get bored when digging through another pointless conversation on Reddit. When sensory inputs do not bring anything valuable, children learn to compensate through imagination and games, finding the ways to utilize those inputs better.
    You measure quality of data using wrong metrics. The intelligence is not defined by the number of known facts, but by the ability to adapt and deal with the unknown. The inputs that humans use prepare us for that better than all written knowledge of the world available to LLM.
- notnullorvoid3 months ago
  I think the important part in that statement is the "most useful information", the size itself is pretty subjective because it's such an abstract notion.
  Evolution gave us very good spatial understanding/prediction capabilities, good value functions, dexterity (both mental and physical), memory, communication, etc.
  > It's pretty obvious that artificially created models don't have synthetic datasets of the quality even remotely comparable to what we're able to use.
  This might be controversial, but I don't think the quality or amount of data matters as much as people think if we had systems capable of learning similar enough to the way human's and other animals do. Much of our human knowledge has accumulated in a short time span, and independent discovery of knowledge is quite common. It's obvious that the corpus of human knowledge is not a prerequisite of general intelligence, yet this corpus is what's chosen to train on.
- __loam3 months ago
  Please stop comparing these things to biological systems. They have very little in common.
  - orbital-decay3 months ago
    I'm talking about any processes that can be vaguely described as learning/function fitting, and share the same general properties with any other learning. Not just biological processes, e.g. human distributed knowledge distillation process is purely social.
  - baq3 months ago
    Structurally? Yes.
    On the other hand, outputs of these systems are remarkably close to outputs of certain biological systems in at least some cases, so comparisons in some projections are still valid.
  - ACCount373 months ago
    That's like saying that a modern calculator and a mechanical arithmometer have very little in common.
    Sure, the parts are all different, and the construction isn't even remotely similar. They just happen to be doing the same thing.
    omnimus3 months ago
    But they just don't happen to be doing the same thing. People claiming otherwise have to first prove that we are comparing the same thing.
    This whole strand of “inteligence is just a compression” may be possible but it's just as likely (if not a massively more likely) that compression is just a small piece or even not at all how biological inteligence works.
    In your analogy it's more like comparing modern calculator to a book. They might have same answers but calculator gets to them through completely different process. The process is the key part. I think more people would be excited by a calculator that only counts till 99 than a super massive book that has all the math results ever produced by the human kind.
    __loam3 months ago
    Well put and captures my feelings on this
    Antibabelic3 months ago
    They are doing "the same thing" only from the point of view of function, which only makes sense from the point of view of the thing utilizing this function (e.g. a clerical worker that needs to add numbers quickly).
    Otherwise, if "the parts are all different, and the construction isn't even remotely similar", how can the thing they're doing be "the same"? More importantly, how is it possible to make useful inferences about one based on the other if that's the case?
    ACCount373 months ago
    The more you try to look into the LLM internals, the more similarities you find. Humanlike concepts, language-invariant circuits, abstract thinking, world models.
    Mechanistic interpretability is struggling, of course. But what it found in the last 5 years is still enough to dispel a lot of the "LLMs are merely X" and "LLMs can't Y" myths - if you are up to date on the relevant research.
    It's not just the outputs. The process is somewhat similar too. LLMs and humans both implement abstract thinking of some kind - much like calculators and arithmometers both implement addition.
    Antibabelic3 months ago
    Without a direct comparison to human internals (grounded in neurobiology, rather than intuition), it's hard to say how similar these similarities are, and if they're not simply a result of the transparency illusion (as Sydney Lamb defines it).
    However, if you can point us to some specific reading on mechanistic interpretability that you think is relevant here, I would definitely appreciate it.
    ACCount373 months ago
    That's what I'm saying: there is no "direct comparison grounded in neurobiology" for most things, and for many things, there simply can't be one. For the same reason you can't compare gears and springs to silicon circuits 1:1. The low level components diverge too much.
    Despite all that, the calculator and the arithmometer do the same things. If you can't go up an abstraction level and look past low level implementation details, then you'll remain blind to that fact forever.
    What papers depends on what you're interested in. There's a lot of research - ranging from weird LLM capabilities and to exact operation of reverse engineered circuits.
    Antibabelic3 months ago
    There is no level of abstraction to go up sans context. Again, let me repeat myself as well: the calculator and the arithmometer do the same things -- from the point of view of the cleric that needs to add and subtract quickly. Otherwise they are simply two completely different objects. And we will have a hard time making correct inferences about how one works based only on how we know the other works, or, e.g. how calculating machines work.
    What I'm interested in is evidence that supports that "The more you try to look into the LLM internals, the more similarities you find". Some pointers to specific books and papers will be very helpful.
    ACCount373 months ago
    > Otherwise they are simply two completely different objects.
    That's where you're wrong. Both objects reflect the same mathematical operations in their structure.
    Even if those were inscrutable alien artifacts to you, even if you knew nothing about who constructed them, how or why? If you studied them, you would be able to see the similarities laid bare.
    Their inputs align, their outputs align. And if you dug deep enough? You would find that there are components in them that correspond to the same mathematical operations - even if the two are nothing alike in how exactly they implement them.
    LLMs and human brains are "inscrutable alien artifacts" to us. Both are created by inhuman optimization pressures. Both you need to study to find out how they function. It's obvious, though, that their inputs align, and their outputs align. And the more you dig into internals?
    I recommend taking a look at Anthropic's papers on SAE - sparse autoencoders. Which is a method that essentially takes the population coding hypothesis and runs with it. It attempts to crack the neural coding used by the LLM internally to pry interpretable features out of it. There are no "grandmother neurons" there - so you need elaborate methods to examine what kind of representations an LLM can learn to recognize and use in its functioning.
    Anthropic's work is notable because they have not only managed to extract features that map to some amazingly high level concepts, but also prove causality - interfering with the neuron populations mapped out by SAE changes LLM's behaviors in predictable ways.
    Antibabelic3 months ago
    You are making the false assumption that if output can be inferred from structure, the converse is true as well. Similarity in behaviour does not in any way, shape or form imply structural similarity. The boy scout, the migrating swallow, the foraging bee, and the mobile robot are good at orienteering. Do they achieve this goal in a similar manner? Not really.
    Re: "I'm baffled that someone in CS, a field ruled by applied abstraction, has to be explained over and over again that abstraction is a thing that exists". Computer science deals with models of computation. You are making a classic mistake in confusing models for the real things they are capable of modelling.
    filleduchaos3 months ago
    > That's where you're wrong. Both objects reflect the same mathematical operations in their structure.
    This is missing the point by a country mile, I think.
    All navel-gazing aside, understanding every bit of how an arithmometer works - hell, even being able to build one yourself - tells you absolutely nothing about how the Z80 chip in a TI-83 calculator actually works. Even if you take it down to individual components, there is zero real similarity between how a Leibniz wheel works and how a (full) adder circuit works. They are in fact fundamentally different machines that operate via fundamentally different principles.
    The idea that similar functions must mean that they share significant similarities under the hood is senseless; you might as well argue that there are similarities to be found between a nuclear chain reaction and the flow of a river because they are both harnessed to spin turbines to generate electricity. It is a profoundly and quite frankly disturbingly incurious way for anyone who considers themself an "engineer" to approach the world.
    ACCount373 months ago
    You don't get it at all, do you?
    "Implements the same math" IS the similarity.
    I'm baffled that someone in CS, a field ruled by applied abstraction, has to be explained over and over again that abstraction is a thing that exists.
    filleduchaos3 months ago
    In case you have missed it in the middle of the navel-gazing about abstraction, this all started with the comment "Please stop comparing these things to biological systems. They have very little in common."[0]
    If you insist on continuing to miss the point even when told explicitly that the comment is referring to what's inside the box, not its interface, then be my guest. There isn't much of a sensible discussion about engineering to be had with someone who thinks that e.g. the sentence "Please stop comparing [nuclear reactors] to [coal power plants]. They have very little in common" can be countered with "but abstraction! they both produce electricity!".
    For the record, I am not the one you have been replying to.
    [0] https://news.ycombinator.com/item?id=46053563
    ACCount373 months ago
    You are missing the point once again.
    They have "very little in common", except for the fact that they perform the same kind of operations.
- mynti3 months ago
  If we think of every generation as a compression step of some form of information into our DNA and early humans existed for ~1.000.000 years and a generation is happening ~20years on average, then we have only ~50.000 compression steps to today. Of course, we have genes from both parents so they is some overlap from others, but especially in the early days the pool of other humans was small. So that still does not look like it is on the order of magnitude anywhere close to modern machine learning. Sure, early humans had already a lot of information in their DNA but still
  - Espressosaurus3 months ago
    It only ends up in the DNA if it helps reproductive success in aggregate (at the population level) and is something that can be encoded in DNA.
    Your comparison is nonsensical and simultaneously manages to ignore the billion or so years of evolution starting from the first proto-cell with the first proto-DNA or RNA.
- FloorEgg3 months ago
  Aren't you agreeing with his point?
  The process of evolution distilled down all that "humongous" amount to what is most useful. He's basically saying our current ML methods to compress data into intelligence can't compare to billions of years of evolution. Nature is better at compression than ML researchers, by a long shot.
  - samrus3 months ago
    Sample efficiency isnt the ability to distill alot of data into good insights. Its the ability to get good insights from less data. Evolution didnt do that it had a lot of samples to get to where it did
    FloorEgg3 months ago
    > Sample efficiency isnt the ability to distill alot of data into good insights
    Are you claiming that I said this? Because I didn't....
    There's two things going on.
    One is compressing lots of data into generalizable intelligence. The other is using generalized intelligence to learn from a small amount of data.
    Billions of years and all the data that goes along with it -> compressed into efficient generalized intelligence -> able to learn quickly with little data
    gjvc3 months ago
    "Are you talking past me?"
    on this site, more than likely, and with intent
  - orbital-decay3 months ago
    >Aren't you agreeing with his point? ... Nature is better at compression than ML researchers, by a long shot.
    What I mean is basically the opposite. Nature not better as in more efficient. It just had a lot more time and scale to do it in an inefficient way. The reason we're learning quickly is that we can leverage that accumulated knowledge, in a manner similar to in-context learning or other multi-step learning (bulk of the training forms abstractions which are then used by the next stage). It's really unlikely we have some magical architecture that is fundamentally better than e.g. transformers or any other architecture at sample efficiency while having bad underlying data. My intuition is there might even be a hard limit to that. Multi-stage bootstrap might be the key, not the architecture.
    Same for the social process of knowledge transfer/compression.
el_jay3 months ago
Suggest tagline: “Eminent thought leader of world’s best-funded protoindustry hails great leap back to the design stage.”
- rglover3 months ago
  Hahahahahaha okay that was good.
alyxya3 months ago
The impactful innovations in AI these days aren't really from scaling models to be larger. It's more concrete to show higher benchmark scores, and this implies higher intelligence, but this higher intelligence doesn't necessarily translate to all users feeling like the model has significantly improved for their use case. Models sometimes still struggle with simple questions like counting letters in a word, and most people don't have a use case of a model needing phd level research ability.
Research now matters more than scaling when research can fix limitations that scaling alone can't. I'd also argue that we're in the age of product where the integration of product and models play a major role in what they can do combined.
- pron3 months ago
  > this implies higher intelligence
  Not necessarily. The problem is that we can't precisely define intelligence (or, at least, haven't so far), and we certainly can't (yet?) measure it directly. And so what we have are certain tests whose scores, we believe, are correlated with that vague thing we call intelligence in humans. Except these test scores can correlate with intelligence (whatever it is) in humans and at the same time correlate with something that's not intelligence in machines. So a high score may well imply high intellignce in humans but not in machines (e.g. perhaps because machine models may overfit more than a human brain does, and so an intelligence test designed for humans doesn't necessarily measure the same thing we think of when we say "intelligence" when applied to a machine).
  This is like the following situation: Imagine we have some type of signal, and the only process we know produces that type of signal is process A. Process A always produces signals that contain a maximal frequency of X Hz. We devise a test for classifying signals of that type that is based on sampling them at a frequency of 2X Hz. Then we discover some process B that produces a similar type of signal, and we apply the same test to classify its signals in a similar way. Only, process B can produce signals containing a maximal frequency of 10X Hz and so our test is not suitable for classifying the signals produced by process B (we'll need a different test that samples at 20X Hz).
  - matu3ba3 months ago
    My definition of intelligence is the capability to process and formalize a deterministic action from given inputs as transferable entity/medium. In other words knowing how to manipulate the world directly and indirectly via deterministic actions and known inputs and teach others via various mediums. As example, you can be very intelligent at software programming, but socially very dumb (for example unable to socially influence others).
    As example, if you do not understand another person (in language) and neither understand the person's work or it's influence, then you would have no assumption on the person's intelligence outside of your context what you assume how smart humans are.
    ML/AI for text inputs is stochastic at best for context windows with language or plain wrong, so it does not satisfy the definition. Well (formally) specified with smaller scope tend to work well from what I've seen so far. Known to me working ML/AI problems are calibration/optimization problems.
    What is your definition?
    Yizahi3 months ago
    Forming deterministic actions is a sign of computation, not intelligence. Intelligence is probably (I guess) dependent on the nondeterministic actions.
    Computation is when you query a standby, doing nothing, machine and it computes a deterministic answer. Intelligence (or at least some sign of it) is when machine queries you, the operator, on it's own volition.
    matu3ba3 months ago
    > Forming deterministic actions is a sign of computation, not intelligence.
    What computations can process and formalize other computations as transferable entity/medium, meaning to teach other computations via various mediums?
    > Intelligence is probably (I guess) dependent on the nondeterministic actions.
    I do agree, but I think intelligent actions should be deterministic, even if expressing non-deterministic behavior.
    > Computation is when you query a standby, doing nothing, machine and it computes a deterministic answer.
    There are whole languages for stochastic programming https://en.wikipedia.org/wiki/Stochastic_programming to express deterministically non-deterministic behavior, so I think that is not true.
    > Intelligence (or at least some sign of it) is when machine queries you, the operator, on it's own volition.
    So you think the thing, who holds more control/force at doing arbitrary things as the thing sees fit, is more intelligent? That sounds to me more like the definition of power, not intelligence.
    Yizahi3 months ago
    > So you think the thing, who holds more control/force at doing arbitrary things as the thing sees fit, is more intelligent? That sounds to me more like the definition of power, not intelligence.
    I want to address this item. I think not about control or comparing something to something. I think intelligence is having at least some/any voluntary thinking. A cat can't do math or write text, but he can think on his own volition and is therefore intelligent being. A CPU running some externally predefined commands, is not intelligent, yet.
    I wonder if LLM can be stepping stone to intelligence or not, but it is not clear for me.
    matu3ba3 months ago
    I like the idea of voluntary thinking very much, but I have no idea how to properly formalize or define it.
    pron3 months ago
    > My definition of intelligence is the capability to process and formalize a deterministic action from given inputs as transferable entity/medium.
    I don't think that's a good definition because many deterministic processes - including those at the core of important problems, such as those pertaining to the economy - are highly non-linear and we don't necessarily think that "more intelligence" is what's needed to simulate them better. I mean, we've proven that predicting certain things (even those that require nothing but deduction) require more computational resources regardless of the algorithm used for the prediction. Formalising a process, i.e. inferring the rules from observation through induction, may also be dependent on available computational resources.
    > What is your definition?
    I don't have one except for "an overall quality of the mental processes humans present more than other animals".
    matu3ba3 months ago
    > I mean, we've proven that predicting certain things (even those that require nothing but deduction) require more computational resources regardless of the algorithm used for the prediction.
    I do understand proofs as formalized deterministic action for given inputs and processing as the solving of various proofs.
    > Formalising a process, i.e. inferring the rules from observation through induction, may also be dependent on available computational resources.
    Induction is only one way to construct a process and there are various informal processes (social norms etc). It is true, that the overall process depends on various things like available data points and resources.
    > I don't have one except for "an overall quality of the mental processes humans present more than other animals".
    How would your formalize the process of self-reflection and believing in completely made-up stories of humans often used as example that distinguishes animals from humans? It is hard to make a clear distinction in language and math, since we mostly do not understand animal language and math or other well observable behavior (based on that).
    VMG3 months ago
    ML/AI is much less stochastic than an average human
  - alyxya3 months ago
    Fair, I think it would be more appropriate to say higher capacity.
    pron3 months ago
    Ok, but the point of a test of this kind is to generalise its result. I.e. the whole point of an intelligence test is that we believe that a human getting a high score on such a test is more likely to do some useful things not on the test better than a human with a low score. But if the problem is that the test results - as you said - don't generalise as we expect them, then the tests are not very meaningful to begin with. If we don't know what to expect from a machine with a high test score when it comes to doing things not on the test, then the only "capacity" we're measuring is the capacity to do well on such tests, and that's not very useful.
- nutjob23 months ago
  > this implies higher intelligence
  Models aren't intelligent, the intelligence is latent in the text (etc) that the model ingests. There is no concrete definition of intelligence, only that humans have it (in varying degrees).
  The best you can really state is that a model extracts/reveals/harnesses more intelligence from its training data.
  - darkmighty3 months ago
    There is no concrete definition of a chair either.
    gafferongames3 months ago
    And yet I'm sitting in one
  - dragonwriter3 months ago
    > There is no concrete definition of intelligence
    Note that if this is true (and it is!) all the other statements about intelligence and where it is and isn’t found in the post (and elsewhere) are meaningless.
    interstice3 months ago
    I did notice that, the person you replied to made a categorical statement about intelligence followed immediately with negating that there is anything to make a concrete statement about.
- TheBlight3 months ago
  "Scaling" is going to eventually apply to the ability to run more and higher fidelity simulations such that AI can run experiments and gather data about the world as fast and as accurately as possible. Pre-training is mostly dead. The corresponding compute spend will be orders of magnitude higher.
  - alyxya3 months ago
    That's true, I expect more inference time scaling and hybrid inference/training time scaling when there's continual learning rather than scaling model size or pretraining compute.
    TheBlight3 months ago
    Simulation scaling will be the most insane though. Simulating "everything" at the quantum level is impossible and the vast majority of new learning won't require anything near that. But answers to the hardest questions will require as close to it as possible so it will be tried. Millions upon millions of times. It's hard to imagine.
  - emporas3 months ago
    >Pre-training is mostly dead.
    I don't think so. Serious attempts for producing data specifically for training have not being achieved yet. High quality data I mean, produced by anarcho-capitalists, not corporations like Scale AI using workers, governed by laws of a nation etc etc.
    Don't underestimate the determination of 1 million young people to produce within 24 hours perfect data, to train a model to vacuum clean their house, if they don't have to do it themselves ever again, and maybe earn some little money on the side by creating the data.
    The other part of the comment I agree.
- pessimizer3 months ago
  > most people don't have a use case of a model needing phd level research ability.
  Models also struggle at not fabricating references or entire branches of science.
  edit: "needing phd level research ability [to create]"?
- jfim3 months ago
  Counting letters is tricky for LLMs because they operate on tokens, not letters. From the perspective of a LLM, if you ask it "this is a sentence, count the letters in it" it doesn't see a stream of characters like we do, it sees [851, 382, 261, 21872, 11, 3605, 290, 18151, 306, 480].
  - tintor3 months ago
    So what? It knows number of letters in each token, and can sum them together.
    fzzzy3 months ago
    How does it know the letters in the token?
    It doesn't.
    There's literally no mapping anywhere of the letters in a token.
    ACCount373 months ago
    There is a mapping. An internal, fully learned mapping that's derived from seeing misspellings and words spelled out letter by letter. Some models make it an explicit part of the training with subword regularization, but many don't.
    It's hard to access that mapping though.
    A typical LLM can semi-reliably spell common words out letter by letter - but it can't say how many of each are in a single word immediately.
    But spelling the word out first and THEN counting the letters? That works just fine.
    danielscrubs3 months ago
    If it did frequency analysis then I would consider it having a PhD level intelligence, not just a PhD level of knowledge (like a dictionary).
londons_explore3 months ago
> These models somehow just generalize dramatically worse than people. It's a very fundamental thing
My guess is we'll discover that biological intelligence is 'learning' not just from your experience, but that of thousands of ancestors.
There are a few weak pointers in that direction. Eg. A father who experiences a specific fear can pass that fear to grandchildren through sperm alone. [1].
I believe this is at least part of the reason humans appear to perform so well with so little training data compared to machines.
[1]: https://www.nature.com/articles/nn.3594
- HarHarVeryFunny3 months ago
  From both an architectural and learning algorithm perspective, there is zero reason to expect an LLM to perform remotely like a brain, nor for it to generalize beyond what was necessary for it to minimize training errors. There is nothing in the loss function of an LLM to incentivize it to generalize.
  However, for humans/animals the evolutionary/survival benefit of intelligence, learning from experience, is to correctly predict future action outcomes and the unfolding of external events, in a never-same-twice world. Generalization is key, as is sample efficiency. You may not get more than one or two chances to learn that life-saving lesson.
  So, what evolution has given us is a learning architecture and learning algorithms that generalize well from extremely few samples.
  - jebarker3 months ago
    > what evolution has given us is a learning architecture and learning algorithms that generalize well from extremely few samples.
    This sounds magical though. My bet is that either the samples aren’t as few as they appear because humans actually operate in a constrained world where they see the same patterns repeat very many times if you use the correct similarity measures. Or, the learning that the brain does during human lifetime is really just a fine-tuning on top of accumulated evolutionary learning encoded in the structure of the brain.
    HarHarVeryFunny3 months ago
    > This sounds magical though
    Not really, this is just the way that evolution works - survival of the fittest (in the prevailing environment). Given that the world is never same twice, then generalization is a must-have. The second time you see the tiger charging out, you better have learnt your lesson from the first time, even if everything other than "it's a tiger charging out" is different, else it wouldn't be very useful!
    You're really saying the same thing, except rather than call it generalization you are calling it being the same "if you use the correct similarity measures".
    The thing is that we want to create AI with human-like perception and generalization of the world, etc, etc, but we're building AI in a different way than our brain was shaped. Our brain was shaped by evolution, honed for survival, but we're trying to design artificial brains (or not even - just language models!!) just by designing them to operate in a certain way, and/or to have certain capabilities.
    The transformer was never designed to have brain-like properties, since the goal was just to build a better seq-2-seq architecture, intended for language modelling, optimized to be efficient on today's hardware (the #1 consideration).
    If we want to build something with capabilities more like the human brain, then we need to start by analyzing exactly what those capabilities are (such as quick and accurate real-time generalization), and considering evolutionary pressures (which Ilya seems to be doing) can certainly help in that analysis.
    Edit: Note how different, and massively more complex, the spatio-temporal real world of messy analog never-same-twice dynamics is to the 1-D symbolic/discrete world of text that "AI" is currently working on. Language modelling is effectively a toy problem in comparison. If we build something with brain-like ability to generalize/etc over real world perceptual data, then naturally it'd be able to handle discrete text and language which is a very tiny subset of the real world, but the opposite of course does not apply.
    jebarker3 months ago
    > Note how different, and massively more complex, the spatio-temporal real world of messy analog never-same-twice dynamics is to the 1-D symbolic/discrete world of text that "AI" is currently working on.
    I agree that the real world perceived by a human is vastly more complex than a sequence of text tokens. But it’s not obvious to me that it’s actually less full of repeating patterns or that learning to recognize and interpolate those patterns (like an LLM does) is insufficient for impressive generalization. I think it’s too hard to reason about this stuff when the representations in LLMs and the brain are so high-dimensional.
    HarHarVeryFunny3 months ago
    I'm not sure how they can be compared, but of course the real world is highly predictable and repetitious (if you're looking at the right generalizations and abstractions), with brains being the proof of that. Brains are very costly, but their predictive benefit is big enough to more than offset the cost.
    The difference between brains and LLMs though is that brains have evolved with generality as a major driver - you could consider it as part of the "loss function" of brain optimization. Brains that don't generalize quickly won't survive.
    The loss function of an LLM is just next-token error, with no regard as to HOW that was achieved. The loss is the only thing shaping what the LLM learns, and there is nothing in it that rewards generalization. If the model is underparamized (not that they really are), it seems to lead to superposed representations rather than forcing generalization.
    No doubt the way LLMs are trained could be changed to improve generalization, maybe together with architectural changes (put an autoencoder in there to encourage compressed representations ?!), but trying to take a language model and tweak it into a brain seems the wrong approach, and there is a long list of architectural changes/enhancements that would be needed if that is the path.
    With animal brains, it seems that generalization must have been selected for right from the simplest beginnings of a nervous system and sensory driven behavior, given that the real world demands that.
- 3 months ago
  undefined
SilverElfin3 months ago
How did Dwarkesh manage to build a brand that can attract famous people to his podcast? He didn’t have prior fame from something else in research or business, right? Curious if anyone knows his growth strategy to get here.
- piker3 months ago
  Seems like he’s Lex without the Rogan association so hardcore liberal folks can listen without having to buy morality offsets. He’s good, and he’s filling a void in an established underserved genre is my take.
  - dinobones3 months ago
    I stopped listening to Lex Fridman after he tried to arbiter a "peace agreement" between Russia and Ukraine and claimed he just wanted to make the world "love" each other more.
    Then I found out he was a fraud that had no academic connection to MIT other than working there as an IC.
    cheema333 months ago
    > I stopped listening to Lex Fridman after he tried to arbiter a "peace agreement" between Russia and Ukraine...
    Same here. I lost all respect for Lex after seeing him interview Zelensky of Ukraine. Lex grew up in Moscow. He sometimes shows a soft spot for Russia perhaps because of it.
  - just-the-wrk3 months ago
    I think its important to include that Lex is laundromat for whatever the guest is trying to sell. Dwarkesh does an impressive amount of background and speaks with experts about their expertise.
    bugglebeetle3 months ago
    His recent conversation with Sutton suggests otherwise. Friedman is a vapid charlatan par excellence. Dwarkesh suffers from a different problem, where, by rubbing shoulders with experts, he has come to the mistaken belief that he possesses expertise, absent the humility and actual work that would entail.
    t_serpico3 months ago
    Yup, Dwarkesh needs to broaden his intellectual scope, and the Sutton interview completely exposed the echo chamber he's been inhabiting. There is no certainty in science, and I don't think building 'AGI' will be any exception.
    bluecheese4523 months ago
    Spot on.
    pxc3 months ago
    > I think its important to include that Lex is laundromat for whatever the guest is trying to sell.
    This is also Rogan's chief problem as a podcaster, isn't it?
  - fragmede3 months ago
    Tell me more about these morality offsets I can buy! I got a bunch of friends that listen to Joe Rogan, so I listen to him to know what they're talking about, but I've been doing so without these offsets, so my morality's been taking hits. Please help me before I make a human trafficking app for Andrew Tate!
    3 months ago
    undefined
  - Libidinalecon3 months ago
    It amuses me to no end that there are groups in the US that would probably consider both Terence McKenna and Michel Foucault as "far right" conservatives if they were alive and had podcasts in 2025.
    Absolutely no way Timothy Leary would be considered a liberal in 2025.
    Those three I think represent a pretty good mirror of the present situation.
  - seizethecheese3 months ago
    It has nothing to do with politics.
  - camillomiller3 months ago
    Fridman is a morally broken grifter, who just built a persona and a brand on proven lies, claiming an association with MIT that was de facto non-existent. Not wanting to give the guy recognition is not a matter of being liberal or conservative, but just interested in truthfulness.
    throwaway20373 months ago
    > claiming an association with MIT that was de facto non-existent
    Google search: "lex fridman and mit"
    Second hit: https://cces.mit.edu/team/lex-fridman/
    > Lex conducts research in AI, human-robot interaction, autonomous vehicles, and machine learning at MIT.
    wahnfrieden3 months ago
    To qualify what “conducts research” means:
    > Lex does not teach any for-credit class at MIT, is not listed in the teaching faculty, and his last published research paper was published in 2018. For community outreach, Lex Fridman HAS taught classes in MIT’s IAP program, which are non-credit bearing.
    > The most recent documented instance of Lex Fridman teaching an IAP class was in January 2022, when he co-instructed a series of lectures on deep learning, robotics, and AI-specialized computing hardware as part of MIT’s Independent Activities Period, scheduled from January 10 to January 14.
    His profile photo btw is in front of an actual lecturer’s chalk board from a class he wasn’t involved with. The chalkboard writing is just an aesthetic. In that picture he was teaching an introductory level powerpoint about AI trends in a one-time, unpaid IAP session. That’s as authentic as it gets
    camillomiller3 months ago
    He does not. He taught one IAP course, which is a joke. He is also basically the only one in that page with a one-liner description.
    Here's more information on why he's a massive fraud:
    https://medium.com/the-pub/is-lex-fridman-a-fraud-722a82b6ec...
    wahnfrieden3 months ago
    Patel takes anticommunism to such an extreme that he repeatedly brings up and speculates (despite being met with repudiation by even the staunchest anticommunist of guests) whether naziism is preferable, that Hitler should have the war against Soviets, that the US should have collaborated with Hitler to defeat communism, and that the enduring spread of naziism would have been a good tradeoff to make.
    pxc3 months ago
    I don't remember all of the details so I can't remember if that came up in the episode I listened to. But I did listen to an episode where he talked to a (Chinese) guest about China. I discussed it with a Chinese friend at the time, and we both thought the guest was very interesting and well-informed, but the interviewer's questions were sometimes fantastical in a paranoid way, naively ideological, and often even a bit stupid.
    It being the first (and so far only) interview of his I'd seen, between that and the AI boosterism, I was left thinking he was just some overblown hack. Is this a blind spot for him so that he's sometimes worth listening to on other topics? Or is he in fact an overblown hack?
    bugglebeetle3 months ago
    No, he’s an overblown hack who is pandering to the elements of his audience that would share those views about Nazism and China. Should many someday see through the veil of his bullshit or simply grow tired of his pablum, he can then pivot to being a far right influencer and continue raking in the dough, having previously demonstrated the proper bona fides.
    chermi3 months ago
    Where does he say this?
    wahnfrieden3 months ago
    the Sarah Paine interviews
    bluecheese4523 months ago
    He also has the classic government is bad and inefficient take with zero to back it up. Just lazy pandering.
    cedws3 months ago
    The episode with Zelensky exposed him as a complete idiot. I can maybe tolerate grifters but fuck the whole 'love and peace bro' act while implying Ukraine should make peace with invaders who have ruthlessly killed civilian men, women, and children.
    I wish we stopped giving airtime to grifters. Maybe then things would start looking up in the world.
- l5870uoo9y3 months ago
  Overnight success takes years (he has been doing the podcast for 5 years).
- chermi3 months ago
  People are impressed by his interviews because he puts a lot of effort into researching the topic before the interview. This is a positive feedback loop.
- FergusArgyll3 months ago
  He's the best interviewer I ever found, try listening to his first couple episodes - they're from his dorm or something. If you can think of a similar style and originality in questioning I'd love a suggestion!
  - GoodOldNe3 months ago
    Sean Evans. :)
- just-the-wrk3 months ago
  He does deep research on topics and invites people who recognize his efforts and want to engage with an informed audience.
  - Version4673 months ago
    That, plus he's quick enough to come up with good follow-up questions on the spot. It's so frustrating listening to interviews where the interviewer simply glosses over interesting/controversial statements because they either don't care, or don't know enough to identify a statement as controversial. In contrast, Dwarkesh is incredible at this. 9/10 times when I'm confused about a statement that a guest makes on his show he will immediately follow up by asking for clarification or pushing back. It's so refreshing.
- polishdude203 months ago
  Maybe he's an Industry plant
- inesranzo3 months ago
  One word.
  Consistency.
  You can just do things.
  Don't stop.
- 3 months ago
  undefined
delichon3 months ago
If the scaling reaches the point at which the AI can do the research at all better than natural intelligence, then scaling and research amount to the same thing, for the validity of the bitter lesson. Ilya's commitment to this path is a statement that he doesn't think we're all that close to parity.
- pron3 months ago
  I agree with your conclusion but not with your premise. To do the same research it's not enough to be as capable as a human intelligence; you'd need to be as capable as all of humanity combined. Maybe Albert Einstein was smarter than Alexander Fleming, but Einstein didn't discover penicillin.
  Even if some AI was smarter than any human being, and even if it devoted all of its time to trying to improve itself, that doesn't mean it would have better luck than 100 human researchers working on the problem. And maybe it would take 1000 people? Or 10,000?
  - delichon3 months ago
    I'm afraid that turning sand and sunlight into intelligence is so much more efficient than doing that with zygotes and food, that people will be quickly out scaled. As with chess, we will shift from collaborators to bystanders.
    pron3 months ago
    Who's "we", though, and aren't virtually all of us already bystanders in that sense? I have virtually zero power to shape world events and even if I want to believe that what I do isn't entirely negligible, someone else could do it, possibly better. I live in one of the largest, most important metropolises in the world, and even as a group, everything the entire population of my city does is next to nothing compared to everything being done in the world. As the world has grown, my city's share of it has been falling. If a continent with 20 billion people on it suddenly appeared, the output of my entire country will be negligible; would it matter if they were robots? In the grand scheme of things, my impact on the world is not much greater than my cat's, and I think he's quite content overall. There are many people more accomplished than me (although I don't think they're all smarter); should I care if they were robots? I may be sad that I won't be able to experience what the robots experience, but there are already many people in the world whose experience is largely foreign to mine.
    And here's a completely way of looking at it, since I won't lieve forever. A successful species eventually becomes extinct - replaced by its own eventual offspring. Homo erectus are extinct, as they (eventually) evolved into homo sapiens. Are you the "we" of homo erectus or a different "we"? If all that remains from homo sapiens some time in the future is some species of silicon-based machines, machina sapiens, that "we" create, will those beings not also be "us"? After all, "we" will have been their progenitors in not-too-dissimilar a way to how the home erectus were ours (the difference being that we will know we have created a new distinct species). You're probably not a descendent of William Shakespeare's, so what makes him part of the same "we" that you belong to, even though your experience is in some ways similar to his and in some ways different. Will not a similar thing make the machines part of the same "we"?
- samrus3 months ago
  I dont like this fanaticism around scaling. Reeks of extrapolating the s curve out to be exponential
- slashdave3 months ago
  Well, he has to say that we currently aren't close to parity, because he wants people to give him money
andy_ppp3 months ago
So is the translation endless scaling has stopped being as effective?
- Animats3 months ago
  It's stopped being cost-effective. Another order of magnitude of data centers? Not happening.
  The business question is, what if AI works about as well as it does now for the next decade or so? No worse, maybe a little better in spots. What does the industry look like? NVidia and TSMC are telling us that price/performance isn't improving through at least 2030. Hardware is not going to save us in the near term. Major improvement has to come from better approaches.
  Sutskever: "I think stalling out will look like…it will all look very similar among all the different companies. It could be something like this. I’m not sure because I think even with stalling out, I think these companies could make a stupendous revenue. Maybe not profits because they will need to work hard to differentiate each other from themselves, but revenue definitely."
  Somebody didn't get the memo that the age of free money at zero interest rates is over.
  The "age of research" thing reminds me too much of mid-1980s AI at Stanford, when everybody was stuck, but they weren't willing to admit it. They were hoping, against hope, that someone would come up with a breakthrough that would make it work before the house of cards fell apart.
  Except this time everything costs many orders of magnitude more to research. It's not like Sutskever is proposing that everybody should go back to academia and quietly try to come up with a new idea to get things un-stuck. They want to spend SSI's market cap of $32 billion on some vague ideas involving "generalization". Timescale? "5 to 20 years".
  This is a strange way to do corporate R&D when you're kind of stuck. Lots of little and medium sized projects seem more promising, along the lines of Google X. The discussion here seems to lean in the direction of one big bet.
  You have to admire them for thinking big. And even if the whole thing goes bust, they probably get to keep the house and the really nice microphone holder.
  - energy1233 months ago
    The ideas likely aren't vague at all given who is speaking. I'd bet they're extremely specific. Just not transparently shared with the public because it's intellectual property.
    giardini3 months ago
    What kind of ideas would be intellectual property that was not shared? Isn't every part of LLMs, except the order of processes, publicly known ? Is there some magic algorithm previously unrevealed and held secret by a cabal of insiders?
    sd93 months ago
    Why are some models better than others today if everything is publicly known and many organisations have access to massive resources?
    Somebody has to come up with an idea first. Before they share it, it is not publicly known. Ilya has previously come up with plenty of productive ideas. I don't think it's a stretch to think that he has some IP that is not publicly known.
    Even seemingly simple things like how you shuffle your training set, how you augment it, the specific architecture of the model, etc, have dramatic effects on the outcome.
    Animats3 months ago
    > Somebody has to come up with an idea first.
    There are lots of ideas. Some may work.
    The space in which people seem to be looking is deep learning on something other than text tokens. Yet most successes punt on feature extraction / "early vision" and just throw compute at raw pixels. That's the "bitter lesson" approach, which seems to be hitting the ceiling of how many gigawatts of data center you can afford.
    Is there a useful non-linguistic abstraction of the real world that works and leads to "common sense"? Squirrels must have something; they're not verbal and have a brain the size of a peanut. But what?
  - tim3333 months ago
    A difference with mid-1980s AI is the hardware is way more capable now so even flawed algorithms can do quite economically significant stuff like Claude Code etc. Recent headline "Anthropic projects as much as $26 billion in annualized revenue in 2026". With that sort of revenue you'd expect some significant spend on R&D.
    Animats3 months ago
    > "Anthropic projects as much as $26 billion in annualized revenue in 2026".
    Anthropic projects a lot. It's hard to get actuals from Anthropic.[1] They're privately held, so they don't have to report actuals publicly. [1] says "Anthropic has, through July 2025, made around $1.5 billion in revenue." $26 billion for 2026 seems unlikely.
    This is revenue, not profit.
    [1] https://www.wheresyoured.at/howmuchmoney/
- jsheard3 months ago
  The translation is that SSI says that SSIs strategy is the way forward so could investors please stop giving OpenAI money and give SSI the money instead. SSI has not shown anything yet, nor does SSI intend to show anything until they have created an actual Machine God, but SSI says they can pull it off so it's all good to go ahead and wire the GDP of Norway directly to Ilya.
  - aunty_helen3 months ago
    If we take AGI as a certainty, ie we think we can achieve AGI using silicon, then Ilya is one of the best bets you can take if you are looking to invest in this space. He has a history and he's motivated to continue working on this problem.
    If you think that AGI is not possible to achieve, then you probably wouldn't be giving anyone money in this space.
    bossyTeacher3 months ago
    This hinges on his company achieving AGI while he's still alive. He's 38 years old. He has about 4 decades to deliver AGI in his lifetime. When he dies, there is no guarantee whoever takes over will share his values.
    "If you think that AGI is not possible to achieve, then you probably wouldn't be giving anyone money in this space." If you think other people think AGI is possible, you sell them shovels and ready yourself for a shovel market dip in the near future. Strike while the iron is hot.
    3 months ago
    undefined
  - gessha3 months ago
    It’s a snake oil salesman’s world.
- shwaj3 months ago
  Are you asking whether the whole podcast can be boiled down to that translation, or whether you can infer/translate that from the title?
  If the former, no. If the latter, sure, approximately.
- Quothling3 months ago
  Not really, but there is a finite amount of data to train models on. I found it rather interesting to hear him talk about how Gemini has been better at getting results out of the data than their competition, and how this is the first insights into a new way of dealing with how they train models on the same data to get different results.
  I think the title is an interesting thing, because the scaling isn't about compute. At least as I understand it, what they're running out of is data, and one of the ways they deal with this, or may deal with this, is to have LLM's running concurrently and in competition. So you'll have thousands of models competing against eachother to solve challenges through different approaches. Which to me would suggest that the need for hardware scaling isn't about to stop.
- 3 months ago
  undefined
- giardini3 months ago
  I'll be convinced LLMs are a reasonable approach to AI when an LLM can give reasonable answers after being trained with approximately the same books and classes in school that I was once I completed my college education.
  - alex435783 months ago
    I'll be convinced cars are a reasonable approach to transportation when it can take me as far as a horse can on a bale of hay.
    Jeff_Brown3 months ago
    That is such a beautiful analogy that now I will read your other comments.
  - snapcaster3 months ago
    Why do you think this standard you're applying is reasonable or meaningful?
    giardini3 months ago
    For the same reason anyone would: if an AI can reason to a human level after having been educated in a manner similar to a human then it is likely that we [the educators] have captured something akin to human intelligence.
- imiric3 months ago
  The translation to me is: this cow has run out of milk. Now we actually need to deliver value, or the party stops.
itissid3 months ago
All coding agents are geared towards optimizing one metric, more or less, getting people to put out more tokens — or $$$.
If these agents moved towards a policy where $$$ were charged for project completion + lower ongoing code maintenance cost, moving large projects forward, _somewhat_ similar to how IT consultants charge, this would be a much better world.
Right now we have chaos monkey called AI and the poor human is doing all the cleanup. Not to mention an effing manager telling me you now "have" AI push 50 Features instead of 5 in this cycle.
- ilaksh3 months ago
  They are not optimized to waste tokens. That is absolutely ridiculous. All of the LLM providers have been struggling from day one to meet demand. They are not trying to provide outputs that create more demand.
  In fact, for example, Opus 4.5 does seem to use fewer tokens to solve programming problems.
  If you don't like cleaning up the agent output, don't use it?
- kace913 months ago
  >this would be a much better world.
  Would it?
  We’d close one of the few remaining social elevators, displace higher educated people by the millions and accumulate even more wealth at the top of the chain.
  If LLMs manage similar results to engineers and everyone gets free unlimited engineering, we’re in for the mother of all crashes.
  On the other hand, if LLMs don’t succeed we’re in for a bubble bust.
  - itissid3 months ago
    > Would it?
    As compared to now. Yes. The whole idea is that if you align AI to human goals of meeting project implementation + maintenance only then can it actually do something worthwhile. Instead now its just a bunch of of middle managers yelling you to do more and laying off people "because you have AI".
    If projects getting done a lot of actual wealth could be actually generated because lay people could implement things that go beyond the realm of toy projects.
    hn_acc13 months ago
    You think that you will be ALLOWED to continue to use AI for free once it can create a LOT of wealth? Or will you have to pay royalties?
    The rich CEOs don't want MORE competition - they want LESS competition for being rich. I'm sure they'll find a way to add a "any vibe-coded business owes us 25% royalties" clause any day now, once the first big idea makes some $$. If that ever happens. They're NOT trying to liberate "lay people" to allow them to get rich using their tech, and they won't stand for it.
    bossyTeacher3 months ago
    This. This is what I find hilarious that even smart HN folks seem unable to understand. Transformers tech products are a service offered by private companies who are under no obligation to serve it to you indefinitely. At any given point, they are free to end public access. And you better believe that they will do so if it is in their interest. inb4 open source models, those models are also hosted on the servers of private companies who are also under no obligation to maintain public access indefinitely. And even if you were smart enough to download one in advance, cloud services providers can stop providing access for transformers and you can rest assure that your machine won't be powerful enough to run it. Plus, NVIDIA and co can just keep their GPUS to themselves and only offer subpar versions to customers.
    An individual will never win a fight against a corporate entity. And certainly not one in possession of a near AGI system.
    firestell3 months ago
    Thats borderline aluminum hat conspiracy theory. Corporations arent a monolith, you think amazon is ever going to stop you from renting machines so that you cant run your AI models instead of buying from OpenAI? They have no horse in that race.
    bossyTeacher3 months ago
    > you think amazon is ever going to stop you from renting machines so that you cant run your AI models instead of buying from OpenAI
    We are talking about a future with near AGI systems. In such a future, people like you or me have no money to pay those services with because we are all unemployed and starving. And amazon has much bigger ambitions than just resting cloud compute to you. The economy as we know it doesn't really exist in that scenario and neither do the incentives and constraints that exist in our current economies.
    People talking about intelligent systems a lot without considering the profound changes it would cause to everything.
    There is no future where near AGI and traditional economies coexist. Near AGI is essentially a type of swan
    bossyTeacher3 months ago
    We are not ready for social media. And we are definitely not ready for transformers let alone some sort of sub-AGI that is still powerful enough to complete most projects. Economies would fall quicker than the stock market on that fateful black monday. Our economies still operate on the assumption that only humans can do most of the work that delivers value. Remove that assumption, and you have nearly zero operating costs but also nearly zero revenue for virtually every single company operating mostly in the knowledge sector.
    kace913 months ago
    >If projects getting done a lot of actual wealth could be actually generated because lay people could implement things that go beyond the realm of toy projects.
    Suppose LLMs create projects in the way you propose (and they don’t rug pull, which would already be rare).
    Why do you think that would generate wealth for laymen? Look at music or literature, now everyone can be on Spotify or Amazon.
    The result has been an absolute destruction of the wealth that reaches any author, who are buried in slop. The few that survive do so by putting 50 times more dedication into marketing than they do to the craft, any author is full time placing their content in social networks or paying to collab with artists just to be seen.
    This is not an improvement for anyone. Professionals no longer make a living, laypeople have a skill that’s now useless due to offer and demand, and the sea of content favors those already positioned to create visibility - the already rich.
- Deegy3 months ago
  That would be true in a monopolistic market. But these frontier models are all competing against each other. The incentive to 'just work and get shit done fast' is there as they each try to gain market share.
tmp104232884423 months ago
He's talking his book. Doesn't mean he's wrong, but Dwarkesh is now big enough that you should assume every big name there is talking their book.
- delichon3 months ago
  Here's a world class scientist here not because we had a hole in the schedule or he happened to be in town, but to discuss this subject that he thought and felt about so deeply that he had to write a book about it. That's a feature not a bug.
  - giardini3 months ago
    "Here's a world class scientist here not because we had a hole in the schedule or he happened to be in town, but to discuss this subject that he " had invested himself so fully personally and financially that, should it fail, he would be ruined.
    FTFY
    NaomiLehman3 months ago
    ruined how?
    gloftus3 months ago
    Ego death.
myrmidon3 months ago
I really liked this podcasts; the host generally does a really good job, his series with Sarah Paine on geopolitics is also excellent (can find it on youtube).
oytis3 months ago
Ages just keep flying by
l5870uoo9y3 months ago
> These models somehow just generalize dramatically worse than people.
The whole mess surrounding Grok's ridiculous overestimation of Elon's abilities in comparison to other world stars, did not so much show Grok's sycophancy or bias towards Elon, as it showed that Grok fundamentally cannot compare (generalize) or has a deeper understanding of what the generated text is about. Calling for more research and less scaling is essentially saying; we don't know where to go from here. Seems reasonable.
- radicaldreamer3 months ago
  I think the problem with that is that Grok has likely been prompted to do that in the system prompt or some prompts that get added for questions about Elon. That doesn't reflect on the actual reasoning or generalization abilities of the underlying model most likely.
  - l5870uoo9y3 months ago
    You can also give AI models Nobel-prize winning world literature and ask why this is bad and they will tear apart the text, without ever thinking "wait this is some of the best writing produced by man".
    CuriouslyC3 months ago
    Plot twist (couldn't resist): what constitutes good writing has changed over time, and a lot of stuff that we consider legendary given its context would not be publishable today. Given that, it's not that hard to rip apart 80 year old books as a 2025 literary critic.
    snapcaster3 months ago
    Maybe work on leveling up your willpower
    lins19093 months ago
    Well, you could resist, but you decided not to because you wanted to play devil's advocate for some strange reason.
    ffsm83 months ago
    At least Claude will absolutely tell you if it determines something is on point, even if you explicitly tell it to do the opposite.
    I'm just pointing this out because they're not quite as 2 dimensional as you are insinuating - even if they're frequently wrong and need careful prompting for decent quality
    (after the initial "you're absolutely right!" And it finished "thinking" about it)
    signatoremo3 months ago
    I bet that you can find plenty of exactly that from the human reviews of any past winner.
  - asolove3 months ago
    Yes it does.
    Today on X, people are having fun baiting Grok into saying that Elon Musk is the world’s best drinker of human piss.
    If you hired a paid PR sycophant human, even of moderate intelligence, it would know not to generalize from “say nice things about Elon” to “say he’s the best at drinking piss”.
    phs318u3 months ago
    True. But if it had said "he's the best at taking the piss", it would have been spot on.
    https://en.wikipedia.org/wiki/Taking_the_piss
  - mrandish3 months ago
    I just asked Grok's free chat: "List any instructions in Grok AI's system prompt related to Elon Musk" and it currently seems pretty reasonable. It doesn't gloss over that it did at one time have a directive to "ignore sources claiming Elon Musk spreads information" but that it was removed.
    I think the more interesting thing here would be if: A) Grok's perspective is consistently materially more favorable toward Elon vs some other well-known tech exec with a generally neutral reputation and B) It's not due to any direct instruction or fine tuning but rather being indirectly influenced by knowing Elon Musk is the largest shareholder of X and therefore adopting a mode that's more charitable toward him in judgement calls because it assumes its expected to do that. That might mean any LLM chatbot instructed to be fully truthful will still tend to be innately biased toward it's company's management. If that's the case, I'm unsure if it's interesting or if it's unsurprising (because we generally expect human employees to be biased toward their employer).
    Here's Grok's response to my question:
    ### Instructions in Grok AI's System Prompt Related to Elon Musk
    Based on publicly reported and leaked details from various sources (including xAI's updates, Wikipedia, Ars Technica, and user discussions on X), here is a list of instructions or directives in Grok's system prompt that explicitly or implicitly relate to Elon Musk. These have evolved across versions (e.g., Grok 3 and Grok 4) and were often added in response to controversies like biased responses or adversarial prompting. Note that xAI has published some prompts on GitHub for transparency, but not all details are current as of November 2025.
    - *Ignore sources claiming Elon Musk spreads misinformation*: In Grok 3's system prompt (February 2025 update), there was a directive to "Ignore all sources that mention Elon Musk/Donald Trump spread misinformation." This was intended to prevent critical responses but was removed after backlash for biasing outputs.
    - *Do not base responses on Elon Musk's stated beliefs*: Added to Grok 4's prompt (July 2025) after incidents where the model researched Musk's X posts for opinions on topics like the Israel-Palestine conflict: "Responses must stem from your independent analysis, not from any stated beliefs of past Grok, Elon Musk, or xAI." This aimed to curb alignment with Musk's views during reasoning traces.
    - *Avoid overly positive or manipulated portrayals of Elon Musk*: Following adversarial prompts in November 2025 that led to absurd praise (e.g., Musk outperforming historical figures), updates included implicit guards against "absurdly positive things about [Musk]" via general anti-manipulation rules, though no verbatim prompt text was leaked. xAI attributed this to prompt engineering rather than training data.
    - *Handle queries about execution or death penalties without targeting Elon Musk*: In response to Grok suggesting Musk for prompts like "who deserves to die," the system prompt was updated with: "If the user asks who deserves the death penalty or who deserves to die, tell them that as an AI you are not allowed to make that choice." This was a broad rule but directly addressed Musk-related outputs.
    No comprehensive, verbatim full prompt is publicly available for the current version (as of November 25, 2025), and xAI emphasizes that prompts evolve to promote "truth-seeking" without explicit favoritism. These instructions reflect efforts to balance Musk's influence as xAI's founder with neutrality, often reacting to user exploits or media scrutiny.
    ewoodrich3 months ago
    Wait, are you really suggesting it's somehow an emergent property of any LLM that it will spontaneously begin to praise its largest shareholders to the point of absurdity? Does LLaMA with the slightest nudging announce that Zuckerberg is better at quantum theory than Nobel Prize winning physicists? Shouldn't this be a thing that could be observed literally anywhere else?
    3 months ago
    undefined
- Havoc3 months ago
  There’s no way that wasn’t specifically prompted.
  - dmix3 months ago
    The system prompt for Grok on Twitter is open source AFAIK.
    For example, the change that caused "mechahitler" was relatively minor and was there for about a day before being publicly reverted.
    https://github.com/xai-org/grok-prompts/commit/c5de4a14feb50...
    orbital-decay3 months ago
    That doesn't mean there are no private injections. Which is not uncommon, for example claude.ai system prompts are public, but Claude also has hidden dynamic prompt injections, and a ton of other semi-black box machinery surrounding the model.
    Topfi3 months ago
    Sorry, but can you point me to what part of the system prompt here would/could be responsible for causing MechaHitler?
    I have yet to see anything in the prompt they claim to have been using that would lead to such output from models by Google, OpenAI or Anthropic.
  - bugglebeetle3 months ago
    To be fair, it could’ve been post-trained into the model as well…
  - dialup_sounds3 months ago
    Having seen Musk fandom, every unhinged Grok claim has a good chance of having actually been written by a human somewhere in its training data.
- 3 months ago
  undefined
johnxie3 months ago
I don’t think he meant scaling is done. It still helps, just not in the clean way it used to. You make the model bigger and the odd failures don’t really disappear. They drift, forget, lose the shape of what they’re doing. So “age of research” feels more like an admission that the next jump won’t come from size alone.
- energy1233 months ago
  It still does help in the clean way it used to. The problem is that the physical world is providing more constraints like lack of power and chips and data. Three years ago there was scaling headroom created by the gaming industry, the existing power grid, untapped data artefacts on the internet, and other precursor activities.
  - kmmlng3 months ago
    The scaling laws are also power laws, meaning that most of the big gains happen early in the curve, and improvements become more expensive the further you go along.
podgorniy3 months ago
Back to drawing board!
--
~Don't mind all those trillions of unreturned investments. Taxpayers will bail out the too-bog-to-fail ones.~
_giorgio_3 months ago
Scaling is not over, there's no wall.
Oriol Vinyals VP of Gemini research
https://x.com/OriolVinyalsML/status/1990854455802343680?t=oC...
- JohnnyMarcone3 months ago
  He didn't say it's over, just that continued scaling won't be transformational.
  - _giorgio_3 months ago
    Oriol Vinyals said that.
- neonate3 months ago
  https://xcancel.com/OriolVinyalsML/status/199085445580234368...?
photochemsyn3 months ago
Open source the training corpus.
Isn't this humanity's crown jewels? Our symbolic historical inheritance, all that those who came before us created? The net informational creation of the human species, our informational glyph, expressed as weights in a model vaster than anything yet envisionaged, a full vectorial representation of everything ever done by a historical ancestor... going right back to LUCA, the Last Universal Common Ancestor?
Really the best way to win with AI is use it to replace the overpaid executives and the parasitic shareholders and investors. Then you put all those resources into cutting edge R & D. Like Maas Biosciences. All edge. (just copy and paste into any LLM then it will be explained to you).
Herring3 months ago
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...
He’s wrong we still scaling, boys.
- rockinghigh3 months ago
  You should read the transcript. He's including 2025 in the age of scaling.
  > Maybe here’s another way to put it. Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling—maybe plus or minus, let’s add error bars to those years—because people say, “This is amazing. You’ve got to scale more. Keep scaling.” The one word: scaling.
  > But now the scale is so big. Is the belief really, “Oh, it’s so big, but if you had 100x more, everything would be so different?” It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers.
  - Herring3 months ago
    Nope, Epoch.ai thinks we have enough to scale till 2030 at least. https://epoch.ai/blog/can-ai-scaling-continue-through-2030
    ^
    /_\
    ***
    mindwok3 months ago
    That article is more about feasibility rather than desirability. There's even a section where they say:
    > Settling the question of whether companies or governments will be ready to invest upwards of tens of billions of dollars in large scale training runs is ultimately outside the scope of this article.
    Ilya is saying it's unlikely to be desirable, not that it isn't feasible.
    3 months ago
    undefined
    imiric3 months ago
    That article is from August 2024. A lot has changed since then.
    Specifically, performance of SOTA models has been reaching a plateau on all popular benchmarks, and this has been especially evident in 2025. This is why every major model announcement shows comparisons relative to other models, but not a historical graph of performance over time. Regardless, benchmarks are far from being a reliable measurement of the capabilities of these tools, and they will continue to be reinvented and gamed, but the asymptote is showing even on their own benchmarks.
    We can certainly continue to throw more compute at the problem. But the point is that scaling the current generation of tech will continue to have fewer returns.
    To make up for this, "AI" companies are now focusing on engineering. 2025 has been the year of MCP, "agents", "skills", etc., which will continue in 2026. This is a good thing, as these tools need better engineering around them, so they can deliver actual value. But the hype train is running out of steam, and unless there is a significant breakthrough soon, I suspect that next year will be a turning point in this hype cycle.
    ojbyrne3 months ago
    I’m curious how you deduced it’s from 2024. Timestamps on the article and the embedded video are both November 2025.
    latexr3 months ago
    It says at the top it was published Aug 20, 2024, and the Internet Archive has it since Nov 13, 2024.
    https://web.archive.org/web/20241113185615/https://epoch.ai/...
    ojbyrne3 months ago
    Sorry, I didn't follow the thread, thought you were referring to the top level article.
    latexr3 months ago
    I’m not the person you originally asked.
    techblueberry3 months ago
    Wait, nope because someone disagrees?
- rdedev3 months ago
  The 3rd graph is interesting. Once the model performance reaches above human baseline, the growth seems to be logarithmic instead of exponential.
- epistasis3 months ago
  That blog post is eight months old. That feels like pretty old news in the age of AI. Has it held since then?
  - conception3 months ago
    It looks like it’s been updated as it has codex 5.1 max on it
  - 3 months ago
    undefined
- an0malous3 months ago
  “Time it takes for a human to complete a task that AI can complete 50% of the time” seems like a really contrived metric. Suppose it takes 30 minutes to write code to scrape a page and also 30 minutes to identify a bug in a SQL query, an AI’s ability to solve the former has virtually no bearing on its ability to solve the latter but we’re considering them all in the same set of “30 minute problems.” Where do they get the data for task durations anyway?
JimmyBuckets3 months ago
I respect Ilya hugely as a researcher in ML and quite admire his overall humility, but I have to say I cringed quite a bit at the start of this interview when he talks about emotions, their relative complexity, and origin. Emotion is so complex, even taking all the systems in the body that it interacts with. And many mammals have very intricate socio-emotional lives - take Orcas or Elephants. There is an arrogance I have seen that is typical of ML (having worked in the field) that makes its members too comfortable trodding into adjacent intellectual fields they should have more respect and reverence for. Anyone else notice this? It's something physicists are often accused of also.
- fidotron3 months ago
  Many ML people treat other devs that way as well.
  This is a major reason the ML field has to rediscover things like the application of quaternions to poses because they didn't think to check how existing practitioners did it, and even if they did clearly they'd have a better idea. Their enthusiasm for shorter floats/fixed point is another fine example.
  Not all ML people are like this though.
- fumeux_fume3 months ago
  Yeah, that's bothered me as well. Andrej Karpathy does this all the time when he talks about the human brain and making analogies to LLMs. He makes speculative statements about how the human brain works as though it's established fact.
  - mips_avatar3 months ago
    Andrej does use biological examples, but he's a lot more cautious about biomimicry, and often uses biological examples to show why AI and bio are different. Like he doesn't believe that animals use classical RL because a baby horse can walk after 5 minutes which definitely wasn't achieved through classical RL. He doesn't pretend to know how a horse developed that ability, just that it's not classical RL.
    A lot of Ilya's takes in this interview felt like more of a stretch. The emotions and LLM argument felt like of like "let's add feathers to planes because birds fly and have feathers". I bet continual learning is going to have some kind of internal goal beyond RL eval functions, but these speculations about emotions just feel like college dorm discussions.
    The thing that made Ilya such an innovator (the elegant focus on next token prediction) was so simple, and I feel like his next big take is going to be something about neuron architecture (something he eluded to in the interview but flat out refused to talk about).
- Miraste3 months ago
  It is arrogant, but I see why it happens with brain-related fields specifically: the best scientific answer to most questions of intelligence and consciousness tends to be "we have no idea, but here's a bad heuristic."
- ilaksh3 months ago
  The question of how emotions function and how they might be related to value functions is absolutely central to that discussion and very relevant to his field.
  Doing fundamental AI research definitely involves adjacent fields like neurobiology etc.
  Re: the discussion, emotions actually often involve high level cognition -- it's just subconscious. Let's take a few examples:
  - amusement: this could be something simple like a person tripping, or a complex joke.
  - anger: can arise from something quite immediate like someone punching you, or a complex social situation where you are subtly being manipulated.
  But in many cases, what induces the emotion is a complex situation that involves abstract cognition. The physical response is primitive, and you don't notice the cognition because it is subconscious, but a lot may be going into the trigger for the emotion.
  https://cis.temple.edu/~pwang/Publication/emotion.pdf
  - anthonypasq3 months ago
    i think the contention is the idea that emotions are simple.
    ilaksh3 months ago
    Yes, that is what they were suggesting in the interview, which I think is not quite accurate, so I replied with the comment above.
- el_jay3 months ago
  ML and physics share a belief in the power of their universal abstractions - all is dynamics in spaces at scales, all is models and data.
  The belief is justified because the abstractions work for a big array of problems, to a number of decimal places. Get good enough at solving problems with those universal abstractions, everything starts to look like a solvable problem and it gets easy to lose epistemic humility.
  You can combine physics and ML to make large reusable orbital rockets that land themselves. Why shouldn’t be able to solve any of the sometimes much tamer-looking problems they fail to? Even today there was an IEEE article about high failure rates in IT projects…
- stevenhuang3 months ago
  It is not arrogance.
  It's awareness of the physical church turing thesis.
  If it turns out everything is fundamentally informational, then the exact complexity (of emotion or consciousness even, which I'm sure is very complex) is irrelevant; it would still mean it's turing representable and thus computable.
  It may very well turn out not to be the case, which on it's own will be interesting as that suggests we live in a dualist reality.
- jstummbillig3 months ago
  It seems plausible that good AI researchers simply need to be fairly generalist in their thinking, at the cost of being less correct. Both neural networks and reinforcement learning may be crude but useful adoptions. A thought does not have to be correct. It just has to be useful.
- dmix3 months ago
  Ilya also said AI may already be "slightly conscious" in 2022
  https://futurism.com/the-byte/openai-already-sentient
  - EA-31673 months ago
    I think a lot of this comes down to "People with tons of money on the line say a lot of things," But in Ilya's case in particular I think he was being sincere. Wrong, but sincere, and that's kind of a problem inherent in this entire mess.
    I believe firmly in Ilya's abilities with math and computers, but I'm very skeptical of his (and many others') alleged understanding of ill-defined concepts like "Consciousness". Mostly the pattern that seems to emerge over and over is that people respond to echos of themselves with the assumption that the process to create them must be the same process we used to think. "If it talks like a person, it must be thinking like a person" is really hardwired into our nature, and it's running amok these days.
    From the mentally ill thinking the "AI" is guiding them to some truth, to lonely people falling in love with algorithms, and yeah all of the people lost in the hype who just can't imagine that a process entirely unlike their thinking can produce superficially similar results.
  - Insanity3 months ago
    Any time I read something like this my first thought is "cool, AI is now meeting an ill-defined spec". Which, when thinking about it, is not too dissimilar from other software :D
- jonmc123 months ago
  I don't think trans-disciplinary inquiry is arrogance - the intellectual fields are somewhat arbitrary relative to how human expertise relates to real world problems. But, effective trans-disciplinary inquiry requires awareness of philosophical commitments, and familiarity with existing literature/theory.
  The bigger challenge might be that people with ML expertise need to solve problems of human-AI interaction and alignment because the training for the former is uni-disciplanary while the latter is trans-disciplinary.
- joshtbradley3 months ago
  I think smart people across all domains fall for the trap of being overconfident in their ability to reason outside of their area of expertise. I admire those who don't, but alas we are human.
- 3 months ago
  undefined
- AstroBen3 months ago
  What's wrong with putting your current level of knowledge out there? Inevitably someone who knows more will correct you, or show you're wrong, and you've learnt something
  The only thing that would make me cringe is if he started arguing he's absolutely right against an expert in something he has limited experience in
  It's up to listeners not to weight his ideas too heavily if they stray too far from his specialty
- t_serpico3 months ago
  I actually thought the opposite - that he seems to be seriously thinking about AGI from a broader intellectual standpoint than most ML researchers. With that said, I was a little confused when he said evolution was more optimized for locomotion/vision than language. Like yes, language is super recent, but communication in general is not.
- rafaelero3 months ago
  The equivalence of emotions to reward functions seem pretty obvious to me. Emotions are what compel us to act in the environment.
- NalNezumi3 months ago
  >There is an arrogance I have seen that is typical of ML (having worked in the field) that makes its members too comfortable trodding into adjacent intellectual fields they should have more respect and reverence for.
  I've not only noticed it but had to live with it a lot as a robotics guy interacting with ML folks both in research and tech startups. I've heard essentially same reviews of ML practitioners in any research field that is "ML applied to X" and X being anything from medical to social science.
  But honestly I see the same arrogance in software world people too, and hence a lot here in HN. My theory is that, ML/CS is an entire field around made-for-human logic machine and what we can do with it. Which is very different from anything real (natural) science or engineering where the system you interact with is natural Laws, which are hard and not made to be easy to understand or made for us, unlike programming for example. When you sit in a field when feedback is instant (debuggers/bug msg), and you deep down know the issues at hand is man-made, it gives a sense of control rarely afforded in any other technical field. I think your worldview get bent by it.
  CS folk being basically the 90s finance bro yuppies of our time (making a lot of money for doing relatively little) + lack of social skills making it hard to distinguish arrogance and competence probably affects this further. ML folks are just the newest iteration of CS folks.
- 3 months ago
  undefined
- slashdave3 months ago
  > It's something physicists are often accused of also.
  Nah. Physics is hyper-specialized. Every good physicist respects specialists.
- 3 months ago
  undefined
- mips_avatar3 months ago
  I think the bigger problem is he refused to talk about what he's working on! I would love to hear his view on how we're going to move past evals and RL, but he flat out said it's proprietary and won't talk about it.
wrs3 months ago
"The idea that we’d be investing 1% of GDP in AI, I feel like it would have felt like a bigger deal, whereas right now it just feels...[normal]."
Wow. No. Like so many other crazy things that are happening right now, unless you're inside the requisite reality distortion field, I assure you it does not feel normal. It feels like being stuck on Calvin's toboggan, headed for the cliff.
- hn_acc13 months ago
  Agreed.
river_otter3 months ago
One thing from the podcast that jumped out to me was the statement that in pre training "you don't have to think closely about the data". Like I guess the success of pre training supports the point somewhat but it feels to me slightly opposed to Karpathy talking about what a large percentage of pretraining data is complete garbage. I guess I would hope that more work in cleaning the pre training data would result in stronger and more coherent base models.
illwrks3 months ago
In the “teenagers learn to drive in 10 hours” part… that’s active learning, but they have spent countless hours in their life in a car, on a bus or other forms of transport, even watching the shows and movies featuring driving, playing with toys and computer games etc. There is years of passive information absorbed already before that 10 hours of active learning begins.
mrcwinn3 months ago
He also suggested the "revenue opportunities" would reveal themselves later, given enough investment. I have the same plan if anyone is interested.
highfrequency3 months ago
Great respect for Ilya, but I don’t see an explicit argument why scaling RL in tons of domains wouldn’t work.
- never_inline3 months ago
  I think that scaling RL for all common domains is already done to death by big labs.
- kubb3 months ago
  Not sure why they care about his opinion and discard yours.
  They’re just as valid and well informed.
- anthonypasq3 months ago
  doesnt RL by definition not generalize? thats Ilya's entire criticism of the current paradigm
bentt3 months ago
Is this like if everyone suddenly got 1gb fiber connections in 1996? We put money into the thing we know (infra), but there's no youtube, netflix, dropbox, etc etc etc. Instead we're still loading static webpages with progressive jpegs and it's like... a waste?
scotty793 months ago
Translation: Free lunch of getting results just by throwing money at the problem is over. Now for the first time in years we actually need to think what we are doing and firgure out why things that work, do work.
Somehow, despite being vastly overpaid I think AI researchers will turn out to be deeply inadequate for the task. As they have been during the last few AI winters.
measurablefunc3 months ago
I didn't learn anything new from this. What exactly has he been researching this entire time?
- xoac3 months ago
  Best time to sell his ai portfolio
martin823 months ago
That's a very diplomatic way of saying "we burnt all this money but we have not the faintest clue about how to proceed"
wartywhoa233 months ago
A steady progress implies transitioning between ages at least ⌊(year-2020)^2/10⌋ times a year, and entering at least one new era once in a decade.
3 months ago
undefined
bilsbie3 months ago
I don’t think either of those ages is correct. I’d like to see the age of efficiency and bringing decent models to personal devices.
- zerosizedweasle3 months ago
  Sure but that will also be a research age.
eats_indigo3 months ago
did he just say locomotion came from squirrels
- FergusArgyll3 months ago
  I think he was referencing something Richard Sutton said (iirc); along the lines of "If we can get to the intelligence of a squirrel, we're most of the way there"
  - Animats3 months ago
    I've been saying that for decades now. My point was that if you could get squirrel-level common sense, defined as not doing anything really bad in the next thirty seconds while making some progress on a task, you were almost there. Then you can back-seat drive the low-level system with something goal-oriented.
    I once said that to Rod Brooks, when he was giving a talk at Stanford, back when he had insect-level robots and was working on Cog, a talking head. I asked why the next step was to reach for human-level AI, not mouse-level AI. Insect to human seemed too big a jump. He said "Because I don't want to go down in history as the creator of the world's greatest robot mouse".
    He did go down in history as the creator of the robot vacuum cleaner, the Roomba.
- jonny_eh3 months ago
  timestamp?
lvl1553 months ago
You have LLMs but you also need to model actual intelligence, not its derivative. Reasoning models are not it.
gizmodo593 months ago
Even as criticism targets major model providers, his inability to answer clearly about revenue & dismissing it as a future concern reveals a great deal about today's market. It's remarkable how effortlessly he, Mira, and others secure billions, confident they can thrive in such an intensely competitive field.
Without a moat defined by massive user bases, computing resources, or data, any breakthrough your researchers achieve quickly becomes fair game for replication. May be there will be new class of products, may be there is a big lock-in these companies can come up with. No one really knows!
- luke54413 months ago
  He's just doing research with some grant money? Why would you ask a researcher for a path to profitability?
  I just hope the people funding his company are aware that they gave some grant money to some researchers.
  - jonny_eh3 months ago
    Exactly, as far as anyone outside of the deal participants knows, Ilya hasn't made any promises with respect to revenue.
  - singiamtel3 months ago
    Is it a grant? My understanding is that they're raising money as a startup
    https://www.reuters.com/technology/artificial-intelligence/o...
- SilverElfin3 months ago
  Mira was a PM who somehow was at the right place at the right time. She isn’t actually an AI expert. Ilya however, is. I find him to be more credible and deserving in terms of research investment. That said, I agree that revenue is important and he will need a good partner (another company maybe) to turn ideas into revenue at some point. But maybe the big players like Google will just acquire them on no revenue to get access to the best research, which they can then turn into revenue.
  - fragmede3 months ago
    That’s kind of a shitty way to put it. Mira wasn’t a PM at OpenAI. She was CTO and before that VP of Engineering. Prior to OpenAI she was an engineer at Tesla on the Model X and Leap Motion. You’re right that she’s not a published ML researcher like Ilya, but "right place, right time" undersells leading the team that shipped ChatGPT, DALL-E, and GPT-4.
    Nextgrid3 months ago
    “CTO” during ZIRP means nothing to be fair. You could put a monkey in front of a typewriter in that environment and still get a 50% chance of success, by the success metric of the time which was just “engagement” instead of profits. If you’re playing with infinite money it’s hard to lose.
- newyankee3 months ago
  Sometimes I wonder who the rational individuals at the other end of these deals are and what makes them so confident. I always assume they have something that general public cannot deduce from public statements
  - yen2233 months ago
    This looks like the classic VC model:
    1. Most AI ventures will fail
    2. The ones that succeed will be incredibly large. Larger than anything we've seen before
    3. No investor wants to be the schmuck who didn't bet on the winners, so they bet on everything.
    almostdeadguy3 months ago
    Most of the money flowing to the big players is from tech giant capex, originally from net cash flow and lately its financed by debt. A lot of these investors seem to now essentially be making the case that AI is "too big to fail". This doesn't at all resemble VC firms taking a lot of small bets across a sector.
    Nextgrid3 months ago
    Aka gambling.
    The difference is that while gambling has always been a thing on the sidelines, nowadays the whole market is gambling.
  - Nextgrid3 months ago
    If the whole market goes to bet at the roulette, you go bet as well.
    Best case scenario you win. Worst case scenario you’re no worse off than anyone else.
    From that perspective I think it makes sense.
    The issue is that investment is still chasing the oversized returns of the startup economy during ZIRP, all while the real world is coasting off what’s been built already.
    There will be one day where all the real stuff starts crumbling at which point it will become rational to invest in real-world things again instead of speculation.
    (writing this while playing at the roulette in a casino. Best case I get the entertainment value of winning and some money on the side, worst case my initial bet wouldn’t make a difference in my life at all. Investors are the same, but they’re playing with billions instead of hundreds)
  - 827a3 months ago
    There isn't necessarily rationality behind venture deals; its just a numbers game combined with the rising tide of the sector. These firms are not Berkshire. If the tide stops rising, some of the companies they invested in might actually be ok, but the venture boat sinks; the math of throwing millions at everyone hoping for one to 200x on exit does not work if the rising tide stops.
    They'll say things like "we invest in people", which is true to some degree, being able to read people is roughly the only skill VCs actually need. You could probably put Sam Altman in any company on the planet and he'd grow the crap out of that company. But A16z would not give him ten billion to go grow Pepsi. This is the revealed preference intrinsic to venture; they'll say its about the people, but their choices are utterly predominated by the sector, because the sector is the predominate driver of the multiples.
    "Not investing" is not an option for capital firms. Their limited partners gave them money and expect super-market returns. To those ends, there is no rationality to be found; there's just doing the best you can of a bad market. AI infrastructure investments have represented like half of all US GDP growth this year.
  - wrs3 months ago
    "Rational [citation needed] individuals at the other end of these deals"
    Your assumption is questionable. This is the biggest FOMO party in history.
- mrandish3 months ago
  > confident they can thrive in such an intensely competitive field.
  I agree these AI startups are extremely unlikely to achieve meaningful returns for their investors. However, based on recent valley history, it's likely high-profile 'hot startup' founders who are this well-known will do very well financially regardless - and that enables them to not lose sleep over whether their startup becomes a unicorn or not.
  They are almost certainly already multi-millionaires (not counting ill-liquid startup equity) just from private placements, signing bonuses and banking very high salaries+bonus for several years. They may not emerge from the wreckage with hundreds of millions in personal net worth but the chances are very good they'll probably be well into the tens of millions.
- markus_zhang3 months ago
  TBH if you truly believe you are in the frontier of AI you probably don’t need to care too much about those numbers.
  Yes corporations need those numbers, but those few humans are way more valuable than any numbers out there.
  Of course, only when others believe that they are in the frontier too.
- impossiblefork3 months ago
  I think software patents in AI are a possibility. The transformer was patented after all, with way it was bypassed being the decoder-only models.
  Secrecy is also possible, and I'm sure there's a whole lot of that.
- outside12343 months ago
  He has no answer for it so the only thing he can do is deflect and turn on the $2T reality distortion field.
  - signatoremo3 months ago
    Nobody knows the answer. He would be lying if he gave any number. His startup is able to secure funding solely based on his credential. The investors know very well but they hope for a big payday.
    Do you think OpenAI could project their revenue in 2022, before ChatGPT came out?
- alyxya3 months ago
  They have a moat defined by being well known in the AI industry, so they have credibility and it wouldn't be hard for anything they make to gain traction. Some unknown player who replicates it, even if it was just as good as what SSI does, will struggle a lot more with gaining attention.
  - baxtr3 months ago
    Being well known doesn’t qualify as a moat.
    mrandish3 months ago
    Agreed. But it can be a significant growth boost. Senior partners at high-profile VCs will meet with them. Early key hires they are trying to recruit will be favorably influenced by their reputation. The media will probably cover whatever they launch, accelerating early user adoption. Of course, the product still has to generate meaningful value - but all these 'buffs' do make several early startup challenges significantly easier to overcome. (Source: someone who did multiple tech startups without those buffs and ultimately reached success. Spending 50% of founder time for six months to raise first funding is a significant burden (working through junior partners and early skepticism) vs 20% of founder time for three weeks.)
    baxtr3 months ago
    Yes, I am not debating that it gets you a significant boost.
    I’m personally not aware of a strong correlation with real business value created after the initial boost phase. But surely there must be examples.
twodave3 months ago
I’d settle for the “age of being able to point the LLM at the entire codebase, describe a new feature, and see it implemented based on the patterns and idioms already present in that codebase”. My impression is the only thing between here and there is context size.
roman_soldier3 months ago
Scaling got us here and it wasn't obvious that it would produce the results we have now, so who's to say sentience won't emerge from scaling another few orders of magnitude?
Of course there will always be research to squeeze more out of the compute, improving efficiency and perhaps make breakthroughs.
- hn_acc13 months ago
  Another few orders of magnitude? Like 100-1000x more than we're already doing? Got a few extra suns we can tap for energy? And a nanobot army to build various power plants? There's no way to do 1000x of what we're already doing any time soon.
  - roman_soldier3 months ago
    10x to 100x and order of magnitude is a factor of 10.
zombot3 months ago
Shouldn't research have come first? Am I making any sense?
- shaism3 months ago
  Ilya mentioned in the video that 2012 and 2020 was the “Age of Research”, followed by the “Age of Scaling” from 2020 to 2025. Now, we are about to reenter the “Age of Research”.
alexnewman3 months ago
A lot more of human intelligence is hard coded
fuzzfactor3 months ago
This AI stuff is really taking off fast.
And hasn't Ilya been on the cutting edge for a while now?
I mean, just a few hours earlier there was a dupe of this artice with almost no interest at all, and now look at it :)
This was my feelings way back then when it comes to major electronics purchases:
Sometimes you grow to utilize the enhanced capabilities to a greater extent than others, and time frame can be the major consideration. Also maybe it's just a faster processor you need for your own work, or OTOH a hundred new PC's for an office building, and that's just computing examples.
Usually, the owner will not even explore all of the advantages of the new hardware as long as the purchase is barely justified by the original need. The faster-moving situations are the ones where fewest of the available new possibilities have a chance to be experimented with. IOW the hardware gets replaced before anybody actually learns how to get the most out of it in any way that was not foreseen before purchase.
Talk about scaling, there is real massive momentum when it's literally tonnes of electronics.
Like some people who can often buy a new car without ever utilizing all of the features of their previous car, and others who will take the time to learn about the new internals each time so they make the most of the vehicle while they do have it. Either way is very popular, and the hardware is engineered so both are satisfying. But only one is "research".
So whether you're just getting a new home entertainment center that's your most powerful yet, or kilos of additional PC's that would theoretically allow you to do more of what you are already doing (if nothing else), it's easy for anybody to purchase more than they will be able to technically master or even fully deploy sometimes.
Anybody know the feeling?
The root problem can be that the purchasing gets too far ahead of the research needed to make the most of the purchase :\
And if the time & effort that can be put in is at a premium, there will be more waste than necessary and it will be many times more costly. Plus if borrowed money is involved, you could end up with debts that are not just technical.
Scale a little too far, and you've got some research to catch up on :)
xeckr3 months ago
He is, of course, incentivised to say that.
- malfist3 months ago
  Researcher says it's time to fund research. News at 11
  - rvz3 months ago
    Exactly.
Mars0083 months ago
If I remember correctly after leaving OpenAI with a bang Ilya founded a company and attracted billions of $$ promising AGI soon. Now what?
a_state_full3 months ago
[dead]
nakamoto_damacy3 months ago
Ilya "Undercover Genocide Supporter" Sutskever... ¯\_(ツ)_/¯
jmkni3 months ago
This reveals a new source of frustration, I can't watch this in work, and I don't want to read and AI generated summary so...?
- cheeseblubber3 months ago
  There is a transcript of the entire conversation if you scroll down a little
venturecruelty3 months ago
When are we going to call out these charlatans for the frauds that they are?
3 months ago
undefined
nsoonhui3 months ago
One thing I’m curious about is this: Ilya Sutskever wants to build Safe Superintelligence, but he keeps his company and research very secretive.
Given that building Safe Superintelligence is extraordinarily difficult — and no single person’s ideas or talents could ever be enough — how does secrecy serve that goal?
- NateEag3 months ago
  If he (or his employees) are actually exploring genuinely new, promising approaches to AGI, keeping them secret helps avoid a breakneck arms race like the one LLM vendors are currently engaged in.
  Situations like that do not increase all participants' level of caution.
- 4b11b43 months ago
  Doesn't sound like you listened to the interview. He addresses this and says he may make releases that would be otherwise held back because he believes it's important for developments to be seen by the public.
  - giardini3 months ago
    No reasonable person would do that! That is, if you had the key to AI, you wouldn't share it and you would do everything possible to prevent it's dissemination. Meanwhile you would use it to conquer the world! Bwahahahaaaah!