Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.
RAM prices are most certainly not crashing (yet), and treating it as a foregone conclusion because _one_ lab found gains could be made and hasn't even reported on the efficiency of their method is just irresponsible. It's almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite.
[0] https://pcpartpicker.com/trends/price/memory/
[1] https://tech.sportskeeda.com/gaming-news/how-google-s-new-tu...
I think it is determined:
Just noticed that pydry made a similar point: https://news.ycombinator.com/item?id=47574216
The fact that public LLM usage is leveling off at a price of $0 and Jensen "we make the shovels in this gold rush" Huang is rather desperately claiming that you need to spend $250k/year in tokens to be taken seriously suggests that demand saturation may not be that far off.
Whether Jevons' Paradox applies to software engineers I think is another open question. I'm constantly being told that it doesn't and that LLMs make half of us redundant now, but I'm skeptical - so much automation I see is broken or badly done.
I'm asking 'cos while I'm philosophically opposed to the first option, I'd love to hear about anything that resembles the second.
This includes encouraging people to set up elaborate multi-model setups (e.g. "gas town") for coding that do not meaningfully improve productivity but which certainly do cause token usage to explode.
It also includes encouraging execs to use token consumption as a proxy for productivity - almost akin to SLOC.
AI has a halo right now and the managerial class seem to be willing to forgive almost any failure because the promise is so enticing. We're at peak expectations right now. They will soon start to be less forgiving when the warts which are intrinsic to LLMs remain unsolved.
As best as I can tell, that's the thinking. It's one number, it's very easy to find and manage, and there is a belief that it directly measures productivity.
I disagree that it does; seems to me the throughput of useful features is a better measure, but I'm not in the driver's seat on this one.
The personal use case stuff is messy and subjective.
Cost savings is primarily a function of headcount here. Which is also easy to measure, and so if we take my thesis that easy to measure stuff is prioritized...
Ultimately the performance will be assessed via the income statement and cash flows of customers of the model producers.
Frankly in the window pre-IPO it’s in the best interests of OAI et al to show a line going to the top-right in relation to tokens, in their prospectus. What does that mean?
Strategic manipulation.
The market has achieved its current saturation level with loss-leader prices that remind me of the Chinese bike share bubble[0]. Once those prices go up to break-even levels (let alone profitable levels), the number of people who can afford to pay will go down dramatically (and that's not even accounting for the bubble pop further constricting people's finances).
If all they do is hike prices then they'll lose customers to competitors who don't or who find a way to serve a similar model cheaper.
The demand isn't going to go away purely through higher prices. Once people know something is possible they will demand it whether supply is constrained or not. That's a huge bounty for anyone who can figure out how to service that demand.
This is the first or second inning in the LLM rollout. It'll take 15-20 more years for full integration of AI agents into the life of the typical person.
The claw experiments for example can just barely be considered alpha stage. They're early AI garbage unfit for the average person to utilize safely. That new world hasn't gotten near the typical person yet.
The compute requirements to get to full integration of AI agents into the life of the average person - billions of them - are far beyond 10x where we're at now.
This is an argument in favor of demand having leveled off.
This doesn't track at all with my experience. Everybody is using it everywhere.
Moreover people are using them for daily life tasks even when it is not an appropriate use of LLMs - e.g. getting medical advice as you referred to or writing emails which are clearly pissing off their coworkers.
In this respect I see it as akin to radium - a new technology that got a little too fashionable for its own good when it first emerged and which will likely have many use cases scaled back.
No one in our Auto shop is using AI. One of the new diagnostic tools was demo'd with AI, and none of us were having it. It's about as accurate as Googling your symptoms.
My mother had an AI powered lung scan that came back with Stage 4 Cancer. The Oncologist got called in (for a fee!) to tell us it was just early stage COPD.
Personally I experienced this when a specialized doctor believed a drug interaction to be the opposite, thinking A hinders the absorption of B, when actually it hinders the clearance, tripling concentration of B.
Without AI, I would have been clueless about this and could not have spotted the mistake. I don't know if it would truly have been critical, but it did shake my confidence in doctors.
I'd be careful stating this is an inappropriate use of LLMs. I'm semi tapped in to the medical literature community and there is a lot of serious discussion and research going into the usage of LLMs for medical advice, and most of it is showing that LLMs are barely worse than doctors, and much much cheaper/more convenient. They definitely aren't ready to completely replace doctors, but it seems they can provide competent medical advice in a pinch. Look out for the literature on this in the coming year; it's only in the last few months that researchers seem to be taking LLMs seriously.
Like, "how was the medical advice" "worse than a doc's, but at least it was cheaper!"
A significant portion of Americans detest the medical industry and deeply dislike going to the doctor, so I don't even think the product needs to be very good to disrupt the way the system works; just different and accessible is likely enough. Funnily enough, restaurants where the food is bad but the portions are big are actually decently popular. Priorities can vary so widely that many people are unable to even comprehend the priorities a significant number of people truly hold.
I like that this comment is below, and posted after, an example where somebody had to pay extra money to clear up a misdiagnosis of stage 4 cancer by the “barely worse” software
I'm certainly not saying fire all the radiologists, just advising an open mind when the actual literature starts saying that LLMs are as good as doctors in some areas.
Personally, I would have used all those tokens to generate synthetic data for IDA (iterated distillation and amplification) so that the more efficient 1000 token/answer chat model can answer more questions, but apparently that doesn't justify an insane datacenter buildout.
Claude Code and co. can now analyze an enterprise codebase to debug issues in a system with multiple services involved.
I don't see how that would have been possible at all in the past.
The ceiling of token use when everyone has something akin to OpenClaw just running as a background process on their phone is way higher than there’s supply for right now. Jevons paradox is still in full force.
The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.
TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.
RAM prices falling during 2026 is insanely unlikely unless AI crashes so hard it starts to actually kill companies. And not just any companies, but big tech.
I'm not seeing that in 2026. Maybe 2027 (I'd sincerely doubt that too, honestly), but definitely not within the next 9 months. Their runway is _way_ too large for things to spiral out of control within such a short period of time
A month ago an AI crash was looking unlikely, but with the Strait of Hormuz being de facto blocked, many predict a global stagflation which could affect AI too.
RAM prices are dropping
Cars come to mind instantly. Prices exploded in 2020/1, due to legitimate shortages, most of which have been plus or minus resolved, but the prices for new (and used!) cars never came back down.
Also the cost of shipping never came down and lots of cars and/or their components need to cross oceans. Plus we have a new energy crisis...
To be fair, they got it from us. This happened to me plenty of times long before modern LLMs.
My personal prediction is that once the VC bill comes due and prices for frontier models starts to climb, competition for efficiency will heat up. The main AI use-cases seem to be falling into buckets, and I doubt serving gigantic, do-it-all general models for every use-case under the sun is remotely cost-effective.
If common use-cases start to be more efficiently served by smaller, more efficient purpose-built models (or systems thereof), it'd make the big frontier models increasingly niche. Cursor's Composer 2 model is a great example of this.
In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.
> In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.
I sure hope so. RAM, HDDs, and SSDs are all crazy-high right now and I was in the market for literally all 3 but have paused all my buying because I can't justify the costs as they stand today.
That's totally fair. The article is written in a very odd way where it makes a bunch of authoritative, factual-sounding claims and then throws a "this is all very speculative" line right at the end.
It's very interesting speculation, but can't really be considered anything more than that, despite the prose it chose.
Stock price is the best forward indicator I can think of
1) Google releasing something probably means they don't see it as important. 4-bit KV-cache quantization has been known for a long time. The fact there is almost a mass hysteria about this paper makes me think there is a lack of skepticism in this AI mania, even in relatively tech-savvy crowd.
2) But prices for memory companies are crashing! Look around, the whole market is crashing.
I haven't looked closely into TurboQuant, but perhaps it will revolutionize just as much as the 1-bit llm did...
Jevons Paradox. When are we going to learn that efficiency gains in AI do not decrease hardware usage?
> Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.
I find it fascinating how extremely reactive things have become. One research paper which, to my knowledge, hasn't been externally replicated yet, nor implemented, generates tons of hyperbolic articles, tweets and such, and actually manages to move the market, at least temporarily. Not just this, but a simple message in full caps lock by the president of the U.S., who is in the habit of lying through his teeth constantly, and the same thing happens. It's like there is a big bubble that threw any form of critical thinking out of the window and is in a hurry to react to anything, even if it is not even remotely believable. Now I understand why it happens: there is a lot of money that can be made by capitalizing on FOMO, either by driving traffic to their website, socials, etc., or by simply insider trading (which feels like it has been legalized these days). But I still find it incredible, the proportions it has started to take.
Can you upgrade in the IDE? It would be strange that Google has a performance problem for paid users while I do not experience any such issues at all with Claude and Codex.
Honestly you're both wrong. RAM prices spiked speculatively, and they're going down for the same reason. Market people always want to argue in fundamentals, when in practice *ALL* the high frequency components of the signal are down to a bunch of traders trying to guess where it's going in the short term.
At best those guesses are informed by ground truth ("AI needs a lot of RAM!" "Sam cornered the market!" "TurboQuant needs less RAM!"), but they remain guesses, and even then you can't tell the difference between that and random motion.
https://pcpartpicker.com/trends/price/memory/
Note how flat the black lines are.
Then note how wide the gray bands are. That makes it very easy to cherry-pick a few examples to present as "supporting evidence" that prices are doing whatever you want to believe they are doing.
Didn't OpenAI buy up 40% of the capacity all at once?
Freshman economics would say that supply is fine and that prices shouldn't move. But they did anyway. And the reason is speculation.
Sure there is. Not formally, but if you hold a contract for x units of future production, you can sell that contract to somebody else who wants those units more than you do.
Futures are standardised forward contracts traded on exchanges
Have we gotten any more word on the potential helium constraints that SK Hynix was making noise about after the strike on the helium plant in the Middle East that supplied 60% of S. Korea's helium? Because that could definitely put a kink in things, since SKH is one of the 3 remaining big DRAM producers.
https://www.mooreslawisdead.com/post/sam-altman-s-dirty-dram...
It's still speculative that OpenAI won't go bankrupt and have to release it back to the market, but if it is holding them unfinished, it is a supply constraint on finished RAM chips even if not on wafer output.
Given Nvidia's CEO's agitation I would give credit to the prediction, and if it's correct, the price will go back to what it was, or even lower if investments in capacity are made today.
A RAM price drop due to some magic efficiencies assumes everything else doesn't change, which I doubt anyone honestly thinks will be the case.
The cost to serve tokens is absolutely profitable today and that’s been true for at least a year. What’s unclear is how R&D and capex fit into the picture. I am not that pessimistic on this front either though. For the data center build outs, demand for tokens is still exceeding supply. On the R&D front, well, most of us here on HN have benefited from decades of overinflated engineering salaries, often paid by companies that were not just unprofitable but usually without any plan for profitability. In this current rush, supply cannot keep up with demand; it’s a much easier math problem when you have something that people want (tokens) and only need to figure out profitability once R&D is included.
And unlike the traditional "this will replace humans right away", I think what this introduces is a lot of incentive to spread those tokens into places where there was never any incentive to hire a software engineer before. In turn, that will drive a lot of business activity in those areas that will potentially fail given the current quality of the output.
This feels like a boom before bust scenario, and I'm not even sure if it will bust.
Seriously, what value are tokens providing other than justifying layoffs? Concretely. Today. Not in the speculative scenario that cardiologists could be replaced with models.
We see this new trend of agentic coding, again a promise that software will be written that way going forward, despite the number of fiascos already experienced when trusting a model turned bad. The use case may provide value, but right now all it does is fulfill the push for token consumption all these AI leaders are advocating for.
All that said the dotcom boom is extremely analogous and that crash was quite bad.
On the other end we have professionals happy to pay a subscription for heavier use, to build something in the hope to sell it.
I figured out I don't believe in the value when my dad explained to me that his mate fired his team once he realised he could just pay 20 bucks for his Gemini account and run his business. I asked, do you call this value add? He said it must be, since he can produce the same output with no staff.
There is a confusion between profiting from a circumstance and value creation.
You create value if, say, you cure a disease. Whether it takes you an army of staff, or whether you extract maximum profit from it, is just a feasibility formula.
That you make the cure more affordable is value creation.
That you cure the same disease but increase your profit doesn't create any value, except to yourself, for a while
Maybe you don't, but it's fairly obvious that a lot of things are changing and things are moving.
Maybe your dad's mate didn't have to expand his business, good for him. Other businesses are expanding because they now can.
Will the positive outweigh the negative? Not necessarily, but to go "it's tulips" is the kind of argument so devoid of nuance that we shouldn't be discussing it on HN.
The overwhelming demand for tokens would not be coming from people wanting a unique illustration - it would be from professional usage. In fact, I'm not even sure who is subsidized. The $20 subscription surely isn't being used fully across all members of that subscription.
The 2000s tech bubble was caused, among other things, by over-investment in infrastructure and technology that had no users yet.
Totally different setup.
That does not mean the AI boom will not turn to bust, but weak analogies generally don't help with understanding complex systems.
Claude helped me implement a ridiculous amount of features in my programming language. It's helped me migrate the heap to an easily moveable index-based object space. It's helped me implement generators. It's helped me implement a new memory allocator. It's helped me fix a ridiculous amounts of bugs and make a huge number of small improvements everywhere. Its ability to provide me repository wide code review was a game changer for a solo developer like me. And it's doing so much more than that. I got more things done in the past few weeks than previous months even though I'm evaluating, learning, understanding and rewriting the AI output.
It's actually addictive to build things with Claude. The usage limits are starting to make me anxious, just like withdrawal syndrome. I applied for their open source max subscription program even though I'm too small for it because who knows, I might get in anyway and it costs nothing.
AI is quite literally a world changing technology. I hope the open models keep steadily progressing and that hardware remains available to all so we can run our own models on our own computers one day.
Just far cheaper (if you are in USA) and probably more useful in terms of job prospects.
Agentic coding absolutely blew up from demand, users are not being tricked into paying $200 a month, and they’re not complaining about hitting rate limits because it’s useless.
users are not being tricked into paying $200 a month
I can't believe people actually believe that people and companies are tricked into paying for tokens. My $20 Codex subscription is so useful, I can easily see myself paying $200 for it. This belief is so common amongst AI collapse people online. I'm guessing these people have only used free ChatGPT or, worse, they use Windows and get Copilot shoved down their throats?
Meanwhile, I'm flying around with a $20 Codex subscription doing everything from writing code, analyzing stocks, coming up with ideas, etc.
IMO if someone last tried this tech 6 months ago, or their only exposure is e.g. via MS Copilot, they do have a rational reason for skepticism. No technology of this complexity has improved this rapidly in my memory (well, ok, we had the CPU speed races from the '90s to the early 2000s).
From the 80486 to the AMD Athlon64 X2, and much of that progress was enabled by better EDA being run on the more powerful CPUs being made with each improvement.
Now, we have better models helping to create even better models.
Let's say they have already plateaued. But hardware continues to get better, right? So tokens should go down in price, not up. Since they're already at 50%+ margins on inference today, better hardware would allow them to generate more tokens for less money.
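A minimal sketch of that arithmetic, taking the 50%+ figure as an inference margin (the 2x hardware improvement is just an illustrative assumption):

    # If the price per token stays fixed and hardware improvements cut serving cost,
    # margins widen -- or, equivalently, there is room to cut prices, not raise them.
    price = 1.00           # normalized price per token
    cost_today = 0.50      # implied by a ~50% inference margin today

    hw_improvement = 2.0   # ASSUMPTION: 2x more tokens per dollar of hardware
    cost_future = cost_today / hw_improvement

    margin_today = 1 - cost_today / price     # 0.50
    margin_future = 1 - cost_future / price   # 0.75
    print(margin_today, margin_future)        # price could also drop ~50% and still break even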
I would pay $500 to start, build stuff with it, then keep going up the tiers as the stuff I'm building makes money.
It's adding tests for me and doing medium complexity refactors that I'd otherwise have to spend hours on
And based on reality (code) rather than my feelz of what I vaguely remember the code to have been doing in some long past.
there is an even larger force on HN that financially _needs_ the value of tokens to be inflated (so much so that bots have overwhelmed the site)
Coding, writing, summarizing, translating, data analysis, customer support, test generation.
Like the OP said, it's incredible how polarizing this debate is. When I read comments like yours, I feel like a significant part of the global workforce in IT must be living on another planet? Or they never really used Claude Code, Codex, OpenCode, ... intensively before because of company policies?
I legitimately am at least 10x more productive than a year ago, and I can prove it in number of commits and finished monetizable features developed per day. Obviously my workflows still very much require an active, constantly context-switching human-in-the-loop, but to me there's absolutely no question both output volume & quality have skyrocketed.
There are millions of other wannabe engineers doing exactly the same, assuming demand will scale as much as the supply.
What returns are you getting on those?
Let me create 500 websites, deployed for free, I hand that over to you by end of day. Will you give me a cent per piece? If so, happy to do business with you.
I would happily pay $200 a month for this. Luckily I dont need to, it's free.
Literally every game and website that I would have had to pay someone else to make I can now make myself. There's no value in that?
A year ago the best free LLM couldn't even give me a basic gridmap and collision. Now it can give me a full RCT style prototype & editor in 20 iterations.
I can only imagine what improvements we will have NEXT year!
Ponder that for a minute.
There are over 2 million games, for Android alone.
That you weren't making games before the advent of LLMs makes it cool that you can now build them, and at no cost. But people have been able to make games without them and have already grown the market to saturation.
If the outcome of LLMs is that we get more games, it won't imply that people will consume more games. Most games never get played anyway.
Op never said they're selling games. They said they're making their own games and websites for a fraction of the cost (even $0). That's amazing value. And it's just getting better.
I didn't mean to patronize; sometimes what's self-evident isn't trivial to notice.
That claim is totally worthless without you providing concrete information how you measured that.
Value add so far lacks evidence.
Layoffs. It justifies them to the public. I'm not certain it grants them as it contradicts a principle of enterprise: scale, as much as you possibly can.
If tokens provided value today, we would be hiring more engineers to review their output and put things together.
That number is at least tenfold of what it was before, simply because I can run a lot of gruntwork in parallel now without wasting brainpower and focus on that stuff.
Maybe we are close to the singularity, or maybe we'll just plateau somehow. But in either case there is so much work to support the breakneck change that isn't getting done, because the change takes priority every single time, that there should be a lot of things to work on.
The question is how big the fail is if you measure it in 3 month increments going back to late 2022.
As long as there is a greater amount of success, it should be net positive.
> For the data center build outs, demand for tokens is still exceeding supply.
Can you provide any numbers for this?
Now we don't know the true size of any of the proprietary models, but my educated guess is that Sonnet is in about the same parameter range, just with better training and much better fine tuning and RLHF. Yet API pricing for Sonnet is $3/MTok input + $15/MTok output, exactly six times as expensive. Even Haiku is twice as expensive as Kimi K2.5.
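To put that gap in concrete terms, a quick sketch (the Kimi K2.5 prices here are just Sonnet's divided by six, per the "exactly six times" comparison above, not quoted from any provider; the workload is made up):

    # Cost comparison for an illustrative workload of 10M input + 2M output tokens.
    # Sonnet prices are the $3/$15 per MTok quoted above; the Kimi K2.5 numbers are
    # simply those divided by six, per the "exactly six times as expensive" claim.
    prices = {
        "sonnet": {"in": 3.00, "out": 15.00},            # $ per MTok
        "kimi_k2.5_inferred": {"in": 0.50, "out": 2.50},
    }
    workload_mtok = {"in": 10, "out": 2}                 # made-up workload

    for model, p in prices.items():
        cost = workload_mtok["in"] * p["in"] + workload_mtok["out"] * p["out"]
        print(f"{model}: ${cost:.2f}")
    # sonnet: $60.00, kimi_k2.5_inferred: $10.00 -- the 6x gap holds for any input/output mix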
I find it difficult to believe in a world where those API prices aren't profitable. For subscription pricing it's harder to tell. We hear about those that get insane value out of their subscription, but there has to be a large mass who never reaches their limits. With company-wide rollouts there might even be a lot of subscription users who consume virtually no tokens at all.
This is false. We may assume it's the most efficient way of generating revenue given their GPUs, but their overall profitability will just be a guess. They would still have incentives to run hardware at maximum, even when it's uncertain whether they'll eventually recoup costs.
> a world where those API prices aren't profitable
A lab with employees and models in training has other costs than the operating expenses of a GPU farm.
If they're losing money and have no VC backing, they'd just turn off the lights.
But that's moving the goalposts? The original claim was on inference itself, not the whole company.
> The cost to serve tokens is absolutely profitable today and that’s been true for at least a year.
Everything is profitable if you ignore the costs.
Are you sure? Surely there is a lot of interesting data in those LLM interactions.
It's fair to say that if all these operators are competing for tokens, then the OpenRouter token operators (not sure of the exact phrase, but the people running the models) are accounting for some level of margin.
However, how many of these are running their own data centers and GPUs?
If they are running their own infrastructure, then it's not a simple equation of if each specific token set is profitable, since it needs to account for the cost of running the data center. It could be that they believe that it is profitable in the long term by utilizing the long tail of asset depreciation, but that isn't guaranteed.
IF they aren't running their own infrastructure, then it's much easier to claim that it's profitable and has a margin (outside of running their servers to manage the rented infrastructure).
HOWEVER, a lot of data centers have some pretty crazy low prices for GPUs that may be vying for user base and revenue over profitability. In these cases, if data center growth starts slowing due to slower buildout then it's very likely GPU prices go up and inference stops becoming profitable for the open router owners.
So long term it's not clear how profitable even these open models are.
OpenAI and Anthropic definitely fall into the latter category too. Their infrastructure requirements are much higher than the open models, and they are being given huge discounts so Microsoft/Amazon/Google can all claim revenue (since they have profitability coming from other parts). It's not clear if OpenAI and Anthropic models would be profitable at inference if they were paying rates that cloud hosts would make a profit from.
There's just way too many dimensions to this scenario to flat out state that open router proves inference is profitable at scale.
That gives you a very good estimate of "how much can you serve the tokens of a model of size N for while making a profit".
Now, keep in mind: Kimi K2.5 is 1T MoE. Today's frontier LLMs are in the 1T to 5T range, also MoE. Make an estimate. Compare that estimate with the actual frontier lab prices.
In the current volatile environment, the API prices are more of a baseline where we can assume it can't be much cheaper to operate these models.
For supply look at outages and growth rates at companies like openrouter. The demand is growing every week.
This is why switching to local open weight models saves a lot of money. (Even though it’s not apples to apples.)
This is the opposite of an AI bubble burst.
300k tokens for that hour.
OpenAI charges $6.
Those are pessimistic assumptions.
If not, you aren't breaking even.
(1) Rent your GPUs.
(2) Pay list price, no volume breaks.
(3) Get only 85 tokens/sec. Realistically, frontier models would attain 200+ tokens/second amortized.
Inference is extremely profitable at scale.
You're generating about 36 million tokens/hour. The cost of Mixtral 8x7b on OpenRouter is $0.54/M input tokens and $0.54/M output tokens.
You're looking at potentially $38.88/hour return on that H100 GPU. This is probably the best case scenario.
In reality, inference providers will use multiple GPUs together to run bigger, smarter models for a higher price.
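A minimal sketch of that back-of-the-envelope (the throughput and OpenRouter price come from the comment above; the H100 rental rate is my own rough assumption):

    # Back-of-the-envelope for serving Mixtral 8x7B on one rented H100.
    tokens_per_sec = 10_000                        # aggregate batched throughput
    mtok_per_hour = tokens_per_sec * 3600 / 1e6    # = 36.0 MTok generated per hour

    price_per_mtok = 0.54                          # OpenRouter list price, $/MTok
    # The $38.88/hour figure assumes a matching volume of input tokens billed at
    # the same rate, i.e. ~72 MTok of billable traffic per hour.
    revenue_per_hour = 2 * mtok_per_hour * price_per_mtok     # = 38.88

    h100_rent_per_hour = 2.50                      # ASSUMPTION: a typical on-demand cloud rate
    print(f"~${revenue_per_hour:.2f}/h revenue vs ~${h100_rent_per_hour:.2f}/h rent")

Even if the throughput assumption is off by 5-10x, the gap stays wide, which is the point the comment is making.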
“Technically correct. The best kind of correct”. So inference may technically be _capable_ of being profitable, but I have questions about it being profitable in _practice_.
It’s insane
In the end value per user is what matters in relation to being a healthy going concern and valuation in relation to Meta for example. Value per token is what should matter too - after all that’s what people are paying for.
'Overinflated' relative to what? You make some good points but I don't accept this as a premise.
Median senior SWE salaries in SF: https://www.levels.fyi/t/software-engineer/levels/senior/loc...
Median income in metro areas: https://www.cnbc.com/2024/07/11/the-median-salary-for-the-25...
Engineering salaries are significantly higher than nearly every other industry on average and on median. Much of this is driven by VC funding rather than sound, profitable, bootstrapped businesses with sustainable profit margins.
Engineering salaries have also been driven upwards significantly the past ~10 years (since the post-2008 crash recovery), while wage growth in the US is mostly stagnant. I don’t have a source handy for that, but there are plentiful studies.
Outside of the US this may be less true, but I took GP’s “most of us on HN” to mean people who work in US tech companies, which are primarily concentrated in high-COL areas.
There was a surge in demand for SWEs and scarcity brought salaries up. Are they too high? Hell no. On average, my colleagues and I generated ~$2M each in 2025 for our company, while we get paid a fraction of that (grants and bonuses included). If you look at net income per employee we are at around 700k each in 2025.
Additionally, employers try their hardest to drive costs down (e.g. offshoring as much as possible, everyone doing layoffs at the same time, ...) and average/median salaries remained high. If the salaries were overinflated, those numbers should have come down, I believe. The fact that they didn't makes me think it is still a scarcity problem, not an overinflation one.
So by that logic, housing in coastal cities also isn't "overinflated"? After all, like SWEs, it's also scarce and in demand. It's also providing enormous value to the people buying/living in it, otherwise they'd be living in Oklahoma or whatever and paying a fraction of the cost.
Is it the same on the job market? I don't think so. I never heard any SWE saying "let's scare people away from a CS career so we can bargain for higher salaries". The opposite is true, though. Companies participate in career fairs, pre-uni events to make people gravitate towards CS careers, ... so with a higher supply each employee loses a bit of bargaining power.
Small excursus: this very fact was taken to the extreme in 2022 when everyone did layoffs at the same time despite the numbers still being great. If you put 300k people on the street at around the same time you can hire some of them for way less money, as they have now lost all leverage (since there are 299,999 other people waiting in line for a job).
Now compare the profit per employee at tech (software engineering) companies and those industries.
But there are thousands if not tens of thousands where the profit per employee is minimal or negative.
I can’t find a source for all tech (the data wouldn’t exist for private firms anyway) but I think it’s telling to look at this list, scroll down to about the middle and look around at salaries you or your colleagues are pulling. Software revenues are certainly high but the industry is afloat because of these high margin businesses creating returns so that low margin businesses can exist. Without the massive infusion in upfront capital, very uncommon in other industries, it’s simply not sustainable.
Typically a market that’s buoyed by its top performers but has significant amounts of capital tied up in under performers is called “a bubble”.
How can you possibly say that? Everyone knows that's not the case, these companies are losing money every day selling tokens. Revenue is not the same thing as profit.
This is why they were freaking out about DeepSeek just taking the trained model weights and slapping an interface on it.
Of course they are profitable if you ignore their cost to bring a product to market.
Yeah it should be factored in, but it’s a different set of implications for long term sustainability. They don’t actually need to test and optimize a new menu every day or week. If they decide to just stick to the same one longer they can get way more return from each dollar spent on development. It’s just that right now the rate of improvement you get with training is really high and nobody can afford to fall behind their competition.
Yeah, sure you can ignore the cost of purchasing the building for the restaurant for most profitability calculations, but if every year or two you were tearing down your old building and building a new one, you better believe that has to be in your profitability calculations.
If you can simply not remodel your restaurant and keep making money, then yeah, it makes sense to call it profitable.
Currently on a given day I'm chewing through approximately the equivalent of my lunch money, but where there's opportunity to extract wealth, someone will find a way to do it.
The wealth of great open models provide an excellent base for fine-tuning, distillation, and RL. I see a lot of untapped potential in the field of bespoke, purpose-built models that can be served far more cheaply than the frontier competition. I would not be surprised if we see frontier-adjacent experiences running comfortably on a Mac Mini by year end.
With frontier models seemingly hitting diminishing returns in quality, I struggle to see a world in which gigantic, expensive, general-purpose models don't become increasingly niche.
But there is no real upper limit. Imagine an LLM which could answer the question "what does my company need to do to beat the competition?". And then realize that the competition asks their LLM the same question. So now everybody is bidding the price up or using more tokens to get a better answer.
In that world there’s no reason for a business enterprise to exist.
Can you explain why you know better than the analyst at Cursor cited in this article?
> well, most of us here on HN have benefited from decades of overinflated engineering salaries, often paid by companies that were not just unprofitable
This is a really concerning perspective: people were paid what they were worth. Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.
I will also note: a startup raising an 8 MM series A and eventually fizzling out is not the same as the hundreds of billions invested in these AI companies without a path to profitability. It is utterly absurd to pretend these are the same thing: any company ingesting that much cash needs to justify its capacity to survive.
Software salary inflation and expansion has made this the case. Tech’s accessibility to the educated has accelerated gentrification massively, driving up prices for rent and food. While the statement is correct, tech’s contribution to income inequality is part of the issue. If you’ve lived in Austin or Chicago (especially Austin) prior to ~2010 you’ll have seen this first hand.
What, why? There are tons of low-margin capex-intensive businesses out there.
I think AI will end up like being like hosting. All the models will converge to being pretty-decent and the companies will have to compete on efficiency since they are selling a generic commodity.
You can already see Anthropic fears this scenario since they try so hard to make people use their first-party tools rather than plugging Claude in as a generic part of a third-party stack.
LLM hosting is the next VPS.
The salary jab was probably a little harsh.
Your ending is a bit of a fizzle too. There are many capex intense businesses that do just fine.
I want to add something to this: it is one of the few fields that can afford a middle or upper middle class lifestyle and is accessible.
I have no doubt that if I could redo my life with the necessary resources I’d be more than capable of putting myself through med school and ending up with a secure career that paid more than I ever made in software.
But at this stage of life? I don’t have the time or money to spend a decade+ paying some institution tens of thousands of dollars to hopefully maybe have a real career.
Once software as a career dies, I suspect many will find themselves locked out the middle class for generations.
On the other hand, once software as a high paying career dies there will be nothing to prop up the status quo (high cost of housing, for example) so the middle class will return to being much more accessible to modestly paid jobs.
Even interpreting what-they-were-worth in the usual sense, I’m not so sure about this. We have seen wage collusion reported by the usual US West Coast-based companies. And some news on here[1] has reported that an engineer with a salary of $100K[2] might be producing $1M of value. Even factoring in the usual “but benefits and overhead”, that comes out to a solid factor of profit per programmer/engineer.
Despite that the sense I get (only from this site since that is my only reference) is that the so-called overpaid engineers are incredibly content to just have this happen to them. As long as they are paid well compared to other workers, it’s fine. No matter the profit factor. In fact, the discourse is very much focused on how “privileged” they were if the tide ever changes. Instead of realizing how much value they provided, collectively.
Outlets for capturing more of the value they create are entrepreneurship (Hello HN). Never any collective organizing. And entrepreneurship is easily bought via acquisition.
Collective bargaining would have been relevant in case they ever get automated... by the very software they co-created.
One could imagine that this “privileged” collection of programmers could have served as a vanguard for the collective good of programming professionals as well as collective ownership of software goods, using their privilege to that end. The former never happened, and the latter is partly realized in people’s free time (see the OSS maintainer in Nebraska meme).[3]
[1] All from recollection since this is just news from the Frontier to me
[2] Of course the pay might be much higher now; this might have been a while ago
[3] when it isn’t simply exploited by corporations just using OSS without giving any back; a logical turn of events when no license or law forces them to contribute back
Well I’m sure they’ll be thrilled to know they can collect $100 a week more in unemployment benefits than their neighbor.
The parent comment doesn't discount that, only pointing out that "what they were worth" was inflated due to a speculative environment. Wherein lies your concern?
“Inflated due to a speculative environment” is not an accurate way to frame labor prices that held for many years. At that point, the prices were simply high due to high demand relative to supply (compared to other types of labor).
That goes without saying. The investigation here is into demand. Which was said to be overinflated due to speculation. As noted, many of the companies hiring the developers did not have viable businesses.
Salaries across industries in the US have remained flat since the 1970s. Calling the one sector that can provide access to a middle class lifestyle inflated is to play into a narrative capital is eager to tell, even if OP didn't intend that.
What do you mean? The real (meaning adjusted for inflation) hourly wage in the US has increased by around 20% since 1970.
What has changed since the 1970s is that wages are no longer coupled to productivity. Perhaps that is what you are thinking of? But that should be an obvious truism for anyone in tech. We create the very things that cause that to be the case!
What happened in the 1970’s was the NeoLiberal shift and wasn’t caused by software.
If we — those with the power to build the productivity creators — took a stand and said "we refuse to create tech for the interests of the few" it would have never happened. But, instead, we welcomed it and are responsible for it.
So no. It wasn’t caused by tech beyond the uninteresting factors like modern society being complex and, of course, that tech developments influence things (pretty much all things).
The benefactor of those gains was also entirely decided by those who created the tech. We could have given use of that tech to everyone. In some cases we actually did (e.g. open source), but in most cases we gained (at least partial) ownership of the capital so it was in our best personal economic interest to keep it for ourselves and our close friends.
"Would have been impossible without" and "being caused by" are different things.
The sense of being “caused by” in a political context refers to the people who have the power to direct things. Who are not necessarily the people who implement something.
> The benefactor of those gains was also entirely decided by those who created the tech.
You assert that they were decided by. Based on what?
The vast majority of tech work was done in employment, either for some government or for private entities. The private entities were controlled by Capital. The governments were controlled by democratic forces and Capital.
> We could have given use of that tech to everyone. In some cases we actually did (e.g. open source),
Again I reference the meme of Overworked Nebraska OSS Maintainer.
The impressive work done on OSS by tech workers directly has been done in their free time. The bulk of OSS work done by people as a living is probably through corporations like e.g. Intel working on the Linux Kernel.[1]
That impressive free time work has gotten the reputation as a treasure trove for the highly motivated and tech literate. In contrast to something that regular people can plug-and-play as an alternative to Big Tech dominance.
> , but in most cases we gained (at least partial) ownership of the capital so it was in our best personal economic interest to keep it for ourselves and our close friends.
Yes, well played. For those that got away with their financial-independence millions. For the rest, well, I guess they never managed to learn the moral lesson of Monopoly.
[1] Or am I wrong here? I could be off-base.
While you are right to recognize that there was some attempt to inject political context, it was not there originally, and is not the main discussion taking place. The fact that wages and productivity have become decoupled is not inherently political. It is but simple mathematics. Tech is the cause for the decoupling; it is why we have been able to become continually more productive and at an accelerating rate.
> The vast majority of tech work was done in employment
Yes, but generally even where employment is present tech workers also demand a share in ownership (e.g. stock). Tech doesn't invent itself. At least not yet. The workers have held the cards. Even those who haven't won the lottery are still in a pretty good economic position, relatively speaking.
In accounting, almost anything you want can be true, at least for some time.
> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT,
The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.
The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.
We have very strong indicators that inference is not a money loser for these companies and is likely very profitable. They should be spending large amounts of money on R&D to get ahead and try new things while they’re serving up tokens.
The “but they’re losing money” argument never seems to be brought out against competitors that literally give away their models for free and for which we can calculate the cost of serving 400B-1T parameter open weight models.
Sounds like it is new for ChatGPT though. That's also how it started with TV and YouTube: first on the free tier, then expanding to the paid ones.
This statement doesn't discount the original statement: that ads are going into GPT, which Sam called a last resort.
> The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.
Usually propped-up companies don't last in the long term once the VC subsidy runs out. There's a difference between getting VC money in order to buy rocket parts, and getting VC money in order to charge $7 when you would really need to charge $10. The latter problem never goes away.
Why is OpenAI specifically losing money hand over fist then?
However, it seems to make a lot of sense. Anthropic literally added $6b ARR in February 2026 alone. I doubt training costs go up that fast.
And it's already mentioned that the path to profitability is that inference revenue eclipses training costs. It's already happening rapidly.
It seems like you're arguing that the bubble is going to collapse soon, like the author? How can it collapse when the demand is so much bigger than supply? Do you think the demand is fake? Or that AI will stop making progress from here on out?
This is deeply ironic in a way. Because the whole premise of AI labor replacement is that AI does not need to be better than human labor, it just needs to be cheaper with acceptable performance. But the same is true one step down: discount AI doesn’t need to be better than bleeding-edge AI, it just needs to be cheaper with acceptable performance.
To be fair people aren't exactly bullish on the prospects of deepseek or z.ai either, it's just they're below radar so they don't get mentioned.
https://en.wikipedia.org/wiki/Z.ai
>> "On 8 January 2026, Z.ai held its initial public offering on the Hong Kong Stock Exchange to become a listed company.[24][25][26] It is considered to be China's first major LLM company that went through an IPO.[26] In February 2026, JPMorgan Chase recommended to investors of purchasing stocks of the company alongside MiniMax.[27]"
https://www.zhipuai.cn/investor_relations/
But I haven't looked into it.
At what point do we declare that a company has "grown" and now must make money? OpenAI is a multi-billion dollar company right now, surely that's a point at which they should be profitable, instead of propped up by further investment and borrowing.
> We have very strong indicators that inference is not a money loser for these companies
All of the economic analysis that I've read strongly states the opposite. Running a GPU is a net loss /even for the data centre operators/. For them to break even, they currently charge OpenAI/Anthropic/Etc more than OpenAI/Anthropic/Etc make per-token.
Yeah, I was wondering why my bullshit detector was going off. This feels as if someone who cooks for Ramsay's kitchen is trying to predict the end of the market hike.
The strategy is always:
* Build something useful
* Give it away for free to get people exited
* Convince investors that this is going to rule the world
* Grow to dominate the world
* Enshittify
Naively speaking, I have so many expectations for the impact of this tech.
I'd expect a noticeable uptick in applications published on Google, Apple and Microsoft app stores. I'd also expect an uptick of games published to Steam. I'd expect an uptick in Github repos and libraries on PyPi.
I'd also expect some impact on the GDP ⸻ a non-negligible part of running a business is communication, planning, ads. Naively, I'd expect that LLMs should be able to both speed some of these things up and lubricate others.
I'd also expect that large corpos like Microsoft and Apple would have more resources to spare on the essential details of their OS like having a functioning taskbar or a predictable, consistent GUI.
I'd expect increased SAT scores or improved PISA results. Maybe even improved mental health, let's go wild.
It strikes me as a reasonably useful tool, personally.
Yet, where are the goods in the aggregate?
So while AI made coding maybe 110% faster, it has also made literally every other person in the process lose their gd minds and they're wanting to break or skip everything else in the process to just shit out code faster.
I have started making an indie game, as one does, and it’s easily going 2-4x speed, but even still I’d predict a year of free time development with focus to ‘finish’ this thing. But the latest agentic tech is 3 months old.
Wow, I'm impressed at your usage of this. Apparently it's 0x2E3B, named "three-em dash".
You must be human!
On Linux you press Ctrl+Shift+U and then type 2E3B, then press enter.
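Or, if you'd rather generate it programmatically, a one-liner (Python here, purely for illustration):

    print("\u2e3b")  # U+2E3B THREE-EM DASH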
This is most likely wrong. Lab executives insist that serving tokens is profitable. It's the cost of training next-gen models that requires them to keep raising ever larger rounds. More importantly, many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.
OpenAI's numbers show that they definitely are not profitable on inference, and even worse, revenue growth scaled linearly with inference cost from 2024 to 2025, which means they can't outgrow this problem. See https://www.wheresyoured.at/oai_docs/
Not hard to believe they're lying about other things when they've been lying about the capability of their products since inception.
[1] https://www.reuters.com/commentary/breakingviews/anthropic-g...
I don’t necessarily see a contradiction. $19B run rate, achieved very recently, is actually consistent with $5B lifetime earnings, because their growth curve is so sharp. Zitron is not good at math.
It's like asking how fast a car is moving.
Did Enron start a business school I'm unaware of or something?
I'd be surprised if they're making money on inference just from that. There's no way someone paying $20 p/m and using it all day is not spending way more on even just the electricity for tokens, let alone the capex.
Key points - if you compare it to openrouter costs for ~similar sized models it is ~90% gross margin.
And this claim came from Cursor - not Anthropic!
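For what it's worth, the margin claim is just a one-line calculation once you pick a reference serving cost; the numbers below are placeholders I'm using for illustration, not Cursor's actual figures:

    # Gross margin if a lab charges frontier prices while its serving cost is in the
    # same ballpark as OpenRouter pricing for similar-sized open-weight models.
    frontier_price_per_mtok = 15.00    # placeholder: Sonnet-class output pricing
    openweight_cost_per_mtok = 1.50    # placeholder: stand-in for a similar-sized open model

    gross_margin = 1 - openweight_cost_per_mtok / frontier_price_per_mtok
    print(f"{gross_margin:.0%}")       # 90%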
Maybe marginally profitable, but right now they need to give out subsidies for people to use their products (Antigravity, Codex, Claude Code et al) in an actually useful manner that prevents churn and at the scale they need to justify usage growth forecasts, which they need to keep the wheel turning.
Probably if you look at the users who exclusively use the simple chat box interfaces (i.e. ChatGPT, Gemini in the UI, Claude in the UI), those plans actually are profitable, but I'd also say that's not where most of the usage comes from.
I'd love to actually look at both usage + profitability from each user segment to see if their PxQ growth expectations from non-enterprise usage make any sense.
> Many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.
Are those open-weight models as good as Anthropic? Are they the same parameter class?
Are they as good as Anthropic was one year ago? That's more like it. They don't have to be just as good, they just need to be the most worthwhile for the price. If frontier models are only providing a negligible advantage for what they charge, that absolutely matters.
The question is more around the moats that these companies have and it seems to me while their models are amazing technology, they don't really have a moat. The open/chinese models still continuously catch up to the american ones.
Another scenario is that dense models get replaced entirely, in which case the likelihood of OpenAI and co pioneering the concept is pretty slim. They will be left with billions worth of infrastructure which cost them 10 times that 2 years earlier, faced with the reality touched on by the article: liquidate.
They would have a period of great margin, followed by possibly zero margin as enterprises move to free options.
They would have to come up with a lot of great products around the inferior models to justify charging at that point.
Even so, their subscriptions are significantly cheaper than the token pricing via API. So at some point they will need to get rid of subscriptions or increase the subscription prices dramatically... And that's assuming their current token pricing is actually profitable. Which it probably isn't.
Lastly, I would not trust one word that comes out of an executive of an AI company (or any other large company, for that matter).
I suspect that once the models hit a point of “good enough” for certain use cases companies will start putting R&D focus in other areas that may be less expensive. Like figuring out how to run more efficiently, UI/UX conventions that help users get what they’re trying to accomplish in fewer steps, various kinds of caching of requests, etc. So the cost to serve tokens over time should only come down, and will probably start coming down more rapidly as the returns to model training slow down.
That’ll probably be a while though, because each successive model tends to be a lot better than the last.
It hints that once these labs get a good enough "everyday model", they can work on efficiency so they can serve these models on old hardware. Which is almost certainly already happening.
Meanwhile companies like Google will keep investing on training...
Anthropic's CEO has suggested all AI companies should slow down training but obviously this is only beneficial for companies that can't afford to keep training.
If we can expect the past 15 years of software UI/UX history to continue, it's more likely they'll spend the money on making the UI/UX more confusing, removing features, and making basic tasks take more steps than they do today.
I'm not saying they're wrong, but I don't put much stock in their words.
It'd be interesting to see what they spend all the money on though as we seem to be hitting diminishing returns and I'm not sure if the typical enterprise user really cares about small improvements on benchmarks.
It seems like it'd probably be better to spend all that on marketing, free trials, exclusivity/bundle deals etc. ChatGPT already has a strong advantage there as it has so much brand recognition. I've seen lay people refer to all LLM's as ChatGPT like my grandparents did with Nintendo and all video game consoles.
Even if ChatGPT has brand recognition amongst lay people, your grandparents aren’t the ones shelling out $200/mo for a Claude Code subscription and paying for extra Opus tokens on top of that. Anthropic’s revenue is now neck and neck with OpenAI’s, but if tomorrow they increased the price of Opus by 5x without increasing its capabilities, many would switch to Gemini, GPT 5.4, Cursor, or any cheap Chinese model. In fact I know many engineers that have multiple subscriptions active and switch when they hit the rate limits of one, precisely because the tools are so interchangeable.
At some point it could even become cheaper to just buy 8x H100s and host Qwen/Deepseek/Kimi/etc yourself if you're one of those companies paying $3k/mo per engineer in tokens.
It absolutely isn't! If billed per token, there is no reason to be married to a single model-family provider at all. The models have very different strengths and weaknesses; you should be taking advantage of this at all times.
Regardless, eventually Google became the universal default for both. When it comes to software, the average person doesn't shop around for the technologically optimal choice; they just use what everyone else is using.
That's why ChatGPT still has a free option. If they didn't, they would lose a billion users overnight to Gemini.
There are a lot of reasons, but in brief: I think AI desktop use is a product that the average person isn't going to get much value out of. To make an analogy, the creators of the Segway thought people would buy them in large numbers, but it turned out most people don't mind walking manually (or at least, don't mind it enough to spend money on a scooter). I think makers of AI desktop-use products are going to find out the same thing as it relates to everyday tasks like checking email and shopping.
In fact, even in the rare cases where it's not possible to get an API or CLI to interface with some piece of software, I think people will find that their best bet is to first create a deterministic screen-scraping program for that specific software, then have that program serve an API for the AI to use. It would be so much cheaper to run (inference-wise) and so much more reliable than having the AI itself perform the image interpretation and clicking.
I see AI desktop use as mainly a consumer product for that reason, since that's the situation where you have to react "on the fly" to whatever the user asks you to do and whatever program happens to be on their computer (versus professional cases which are more large-scale and repetitive, and where you can have a software developer on hand).
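The wrapper pattern described above (a deterministic scraper that exposes a small API for the AI to call) is cheap to sketch. Here is a minimal illustration in Python using Flask; the order-lookup endpoint and the read_order_status function are hypothetical placeholders, and the actual driving of the target application (e.g. via pywinauto or pyautogui) is stubbed out.

    from flask import Flask, jsonify

    app = Flask(__name__)

    def read_order_status(order_id: str) -> dict:
        # Hypothetical placeholder: a real integration would drive the target
        # application deterministically (e.g. with pywinauto or pyautogui),
        # navigate to the right screen, and read the fields it needs.
        return {"order_id": order_id, "status": "shipped", "eta_days": 3}

    @app.route("/orders/<order_id>")
    def order_status(order_id: str):
        # The AI agent calls this structured endpoint; no vision-model
        # inference or on-the-fly clicking is involved.
        return jsonify(read_order_status(order_id))

    if __name__ == "__main__":
        app.run(port=8080)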
That doesn't even begin to cover the lack of actual electricity to power the data centers. We have more "dark silicon" sitting in boxes that isn't close to being deployed, while a lot of actual people can't manage to buy consumer products at anything resembling reasonable prices... it's kind of insane, to say the least.
Look, I'm a Microsoft hater like the rest of us, but calling Microsoft's products sub-par discredits the author a good bit. I invite anyone who thinks this to try and compete with them. Go after something like Word, for example. Then prepare to be awed by what some of the most brilliant programming minds ever can produce after grinding for four decades.
When I'm using MS Word and it takes 20 seconds to cold launch on a machine that's orders of magnitude faster than any computer from 25 years ago, where it launched near instantly, I can tell something is going wrong. When all of their software is harassing me to use AI in ways I don't want to use it, I can tell something is going wrong.
I don't know if you noticed, but there was a shifting of the goalposts from "sub-par" to "something wrong/sub-optimal".
The best helicopter you can buy may in fact crash into trees sometimes.
Markdown has much less of that brilliance, and thankfully I needed none of it either.
The last time I authored a Word document was probably 2 years ago, for a government interaction.
MS Office should last a while if they stop calling it "Copilot 365 Office" or whatever it was.
> AI is here to stay. If used right, chances are it will make us all more productive. That, on the other hand, does not mean it will be a good investment.
The railroad bubble burst in 1846 not because trains were a dead end - passenger numbers would increase more than 10x in the UK over the following 50 years.
This is high up there on the list of things people say right before, you know, it does.
Stay competitive how? If the Magnificent 7 aren't spending the money, then how could it possibly hurt OpenAI/Anthropic to not raise equal amounts of money? Maybe you can pull together an explanation, but this author didn't even try to do so.
This piece seems poorly thought-out, but well designed to get shared.
Promote writers who will actually explain their claims carefully.
If it does all go down in flames, even the floor value is not going to be that high.
I can't predict the future but it's smelling a lot like a recession already under way that is bigger than the sub-prime crash.
They basically decided that scaling at any cost was the way to go. This only works as a strategy if efficiency can't work, not if you simply haven't tried. Otherwise, a few breakthroughs and order-of-magnitude improvements later, people are running equivalent models on their desktops, then their laptops, then their phones.
Arguably the costs involved mean that our existing hardware and software is simply non-viable for what they were and are trying to do, and a few iterations later the money will simply have been wasted. If you consider funnelling everything to Nvidia shareholders wasting it, which I do.
You cannot find the efficiency if you haven't been experimenting at scale; this is true personally as well.
If someone hasn't been burning a few billion tokens per month, everything coming out of their mouth about AI is largely theory. It could be right or wrong, but they don't have the practice to validate what they're talking about.
Not everyone scaling to that degree will have the right answer or outcome; many will be wrong and go bust. But no one who didn't scale will have the right answer.
In the worst of the worst case, they're building know-how of how to manage big datacenters, infra and data-labeling teams. These will be incredibly valuable over the next few years. And no, no one, not even the AI companies' executives themselves, believes that you can delegate business know-how to LLMs.
Like how the GPT LLMs were kind of a side project at OpenAI until someone showed how powerful they could be if you threw a lot more parameters at them.
There could be some other architecture in the works that makes GPTs look old - the first to build and train that new AI will be the winner.
I don't expect hardware prices to go down unless the third option (economic collapse) happens before somebody triggers the dystopia/extinction option.
They aren't all necessarily racing to be "god", some are racing to make sure someone else is not "god".
If it weren't for Altman releasing ChatGPT, it's very likely that we would have markedly less powerful LLMs at our disposal right now. Deepmind and Anthropic were taking incredibly safe and conservative approaches towards transformers, but OAI broke the silent truce and forced a race.
Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute. Google, OpenAI, Anthropic, Meta, AWS, Azure are all compute constrained. Every single one of them said so publicly. Neo clouds are telling customers they're all sold out now and you even have to book compute in advance if you're an AI company.
OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers.
The AI bubble is bursting because OpenAI is trying to monetize free users on ChatGPT with ads, but Anthropic is kicking butt in AI? What kind of logic is that? It seems like AI can be monetized, as Anthropic shows. Is AI going to burst because OpenAI can't monetize but Anthropic can? I wouldn't be surprised at all if in the next couple of quarters we see OpenAI looking for an exit. It will be interesting, because the sizes are now so big that we will probably learn all the details. The most likely buyer is Microsoft: they already own a lot of it, and because of that, they are the most interested in showing a win.
I'll take the opposite stance. I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years. I think Anthropic and OpenAI are going to run laps around current big tech, except maybe Google. For example, in a few years, I think AI agents could completely replace Microsoft Office, Microsoft's cash cow.

Independent reports state that Claude's metered models are priced 5x higher than what subscribers pay.
Already dispelled. It isn't 5x more expensive than what their subscribers pay. Inference has a gross margin of 50%+. It's been repeated over and over again by the Anthropic CEO, the OpenAI CEO, and just about anyone who's done deep analysis on token profitability. If you don't believe the OpenAI and Anthropic CEOs, just look at inference providers on Openrouter. They don't have VCs backing them to sell tokens at a loss; they have to be making margins on every token in order to keep the lights on.

Then why aren't the manufacturers of the hardware components needed by AI companies making plans yesterday to bring new fabs online to meet demand? That isn't a gotcha question, I genuinely want to know. The money involved isn't that much compared to the money changing hands between Nvidia, Microsoft, OpenAI, etc., and it's not like once in-progress data center construction is complete they won't need to buy more RAM and GPUs, especially with any new advances in technology that might happen.
Inevitably someone will reply that hardware manufacturers don't want to be stuck losing money on a facility because the bubble popped and demand disappeared, but if Anthropic and OpenAI are going to "run laps around current big tech", it should be a no-brainer to increase production capacity.
There is one supplier of EUV lithography machines in the world, ASML. They are basically acting as an integrator for hundreds of highly specialized components manufactured to unimaginable levels of precision. Each of those components has roughly one eligible supplier in the world, who is operating at full capacity. To expand, they'll need yet another set of specialized and almost impossible-to-build equipment.
So the supply chain moves incredibly slowly, and the slowness is intrinsic due to the complexity and depth of the supply chain. It can't be fixed with just money. IIRC ASML is aiming to merely double their production of EUV lithography machines by 2030.
Once the ability of the supply chain to grow has been saturated, no amount of extra confidence will make it grow faster.
The bottleneck is ASML, who can only make so many EUV machines. No one else can make EUV machines.
Scaling chip fabs and chip equipment is much harder. And you have to understand that chip fabs go bankrupt if demand suddenly drops so they have to be more cautious by default.
How? What do you think lawyers/government will use to write briefs?
that's not what the article said:
> They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them
He said AI is going to bust because OpenAI needs to put ads on free tier. Then he said Anthropic is doing great with enterprise customers.
So which is it? Is AI going to burst because OpenAI needs to put ads on ChatGPT? Or is AI not going to burst because Anthropic is doing great in enterprise?
The logic has glaring flaws.
I have yet to see how a one-legged business model with just a single product (that is not crude oil), without a plan and without money, is going to become sustainable. Oh yeah, maybe they'll finally make money on those autonomous lethal weapons. That sounds the easiest.
First, OpenAI and Anthropic are the leaders in model capabilities. Google is a close 3rd but 3rd nonetheless.
Second, ChatGPT likely has about 1 billion active users right now. I think ads on ChatGPT will surpass even Google search ads in the future. There will be a class of users who will never pay for ChatGPT subscriptions, and that's ok. Meta and Google are two of the most profitable companies in history, and their cash cows rely almost solely on free users. "Ask ChatGPT" is already "google it" for the masses.
Third, there is so much untapped revenue potential in the science and medicine fields that OpenAI can eventually own with Anthropic. Microsoft stands no chance here since they can't build competing models.
Fourth, I can easily see ChatGPT morphing into agents for consumers, and people will pay for them. AI is moving up the value chain fast. I don't see any reason why consumers would pay for Netflix but not for ChatGPT.
Just some basic ideas based on public knowledge. I'm sure there are plenty more.
I'm not going to bet my house that OpenAI will become bigger than Microsoft in 3 years, but I'll put down a few hundred dollars on this bet.
Internet Explorer being pre-installed on Windows devices didn't prevent it from being demolished by newcomer Chrome throughout the 2010s. Now we're looking at a product that's even less integrated, and whose value is exposed through universal interfaces (human language, images, etc.).
If OpenAI succeeds, I imagine that remarkably little of it will have come from the brand. But subtracting the first-mover brand advantage, they can either compete on the frontier, which seems difficult and bears potentially diminishing returns (particularly wrt distillation); or compete as a commodity, which I imagine cannot justify their valuation/spend.
It seems like a very uphill battle.
Except the investment is more like a railway or utility. It generates like 3% return, which is definitely not good enough for the people providing the money, or (in the case of the profitable companies) anywhere near the double-digit returns they make on their technology products. I won't be surprised when we see consolidation of marginal players and abandonment of the losers, just like you can find rail lines to nowhere, and fiber that's never been used.
With a gaming GPU you can run Qwen3.5-35B-A3B. I use 122B-A10B on my local rig (1x6000 Pro), and 397B-A17B on my 2x6000 Pro server (some spillover into CPU/RAM). It's pricey now but probably within a few years it'll become very affordable.
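As a rough illustration of what hosting a model yourself looks like in practice, here is a minimal sketch using llama-cpp-python; the GGUF file path is a placeholder for whatever quantization of an open-weight model fits your VRAM, and the parameters are just starting points, not a tuned configuration.

    # Minimal local-inference sketch (pip install llama-cpp-python).
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/open-weight-moe-q4_k_m.gguf",  # placeholder path to a local GGUF file
        n_gpu_layers=-1,  # offload every layer that fits onto the GPU
        n_ctx=8192,       # context window; raise it if VRAM allows
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Why are MoE models comparatively cheap to serve?"}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])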
> Anthropic is already in a push to reduce costs and increase revenue
Yeah, it's totally a bad sign when a company tries to... reduce costs and increase revenue.
Usually in a land grab like this you spend, spend, spend.
Uber was still paying to subsidize customer's rides until fairly recently to kill off the competition.
When AI companies look to cut cost: a sign of bubble bursting.
When RAM price goes up: a sign of bubble bursting.
When RAM price goes down: a sign of bubble bursting.
I do not see this talked about often enough whilst everyone is in the process of introducing hard dependencies on these services into their workflows.
The tragedy is when it's all over one of the surviving passengers will go "See! I knew we were going to crash because of that knitter"
But those things are tied together.
Even xAI, that now has a reasonably competitive model, is struggling to achieve PMF. Meta is in shambles because their models have underperformed for years now.
I do hope that RAM prices come down but this was just wishful thinking.
The interesting questions are: "What triggers it" and "what also goes tits up"?
The issue with high/international finance is that a good percentage of it (if not more) is fraudulent or semi fraudulent bollocks.
"Here is a startup that is worth x million because y" Both of those statements are bollocks. However its in the interest of most people to agree with that bollocks to get money. If enough money is given there is a chance that the startup will make money.
If we look a few years back, NFTs fulfilled that niche quite nicely. It was obviously bollocks, but a very convenient way to launder money, or to run a series of rugpull operations.
The problem we have to contend with now is that the sheer amount of money that has been invested all disappearing at once would require 2007/8 levels of coordination to unfuck. The US government does not have the requisite number of admins to pull that off again, and no political will to ever have that expertise again. So if AI does go pop, and it takes a lot of money with it, I would put my guess on China doing the money lubrication and extracting a subtle but richly ironic level of control in exchange.
Also, there's no guarantee that AI will trigger the next bubble popping; my money is on Private Equity.
That's like saying "I know exactly how you're going to die, your heart will stop"
Have you tried Gemini 3.1 lately? It is not even close to Opus 4.6 never mind Claude 5.
This post, like many pessimistic takes, seriously discounts innovation and the exponential takeoff of recursive self-improvement.
Currently a lot of that appears to be marketing hype to drive up usage. Is it exponential, or are the labs spending exponentially more for smaller and smaller gains from LLMs?
I don't think Sora was ever thought of as a "revenue driver", considering how notoriously expensive and unpredictable video generation via inference is. OpenAI is just a repeat of Uber—minus the scandals—in a different decade. Uber got itself into tons of businesses related to transportation on the assumption that it would all be viable "one day." Same stuff that OpenAI is doing.
I would say that once the bubble bursts—which is likely, considering the geopolitical environment—OpenAI, Anthropic, and Alphabet are likely to be the winners, with a lot of small players at the tail end. Anthropic won over programmers, and OpenAI won over everyone else. For millions of people, AI = ChatGPT, so I would bet that OpenAI can still become profitable once they cut down their expenses.
Given the tech bros involved, we just don't know about them yet. Also was this comment generated using AI? Look at all the em dashes.
My guess is that cloud companies will scoop up the data centers for pennies on the dollar and the GPUs get written off or fire-sold to enthusiasts still wanting to run local models. Then they can offer exceptionally low initial prices to new customers and get more people to be locked in. Or maybe we see a couple of new cloud companies start up but that would likely need lower interest rates.
The thing that is different is the scale and the hardware. When Britain underwent its rail-building boom in the 1850s, the bubble bursting left the kingdom with 150 years' worth of infrastructure. Unless we invest in energy buildouts, we will be left with billions in rapidly depreciating GPUs.
However, the core utility of the best AI (read: Anthropic's at the moment, by miles) will still exist and be leveraged by those who have learned to use it well.
I could also see the exponentially declining power requirements offsetting the exponential-but-slower rate of AI compute demand, which then renders a lot of unused capacity in these massive data centers.
I think of it like the old mainframes in the 70s which would take an entire city block to run, and now we have the equivalent of millions, if not billions of them in our pockets.
In general, AI is very much like human intelligence in that no two models are the same, just like no two people are the same. IOW, if you are a single-model shop you might not even have any idea that you're falling behind.
I think this is a good comparison to current AI.
> billions of them in our pockets.
AI in your pocket (but first on the desktop) is a real possibility.
This bodes well for us being at a point that even if the bubble burst, we'd still have usable AI going forward.
About 2 months ago this place was unbearable - filled with doom and hype AI posts. I welcome the calming and eventual slow release of the bubble.
By which I mean the competent organizations are the ones that will come up with cultural and technical solutions to manage the quantity and quality of the code better.
Others will suffer severe quality issues. Not because the "AI"s produce inherently inferior code, but because the volume of code is too high to manage review of, and too high to maintain the internal organizational knowledge needed to handle the pages in the middle of the night when servers go down because of code nobody really understood.
I produce masses of independent project work all day long in my spare time using these tools and they blow me away. But in the context of professional work on teams with other coworkers, the results are difficult to reason about and often impossible to competently review, and it's not clear the results are superior. IMHO companies that drink too deep from the well without caution could be burned badly.
Aside:
I hate to say it, but there is no sense in which Anthropic has the clearly better product than OpenAI at this point. I know Claude caught developers' hearts through the fall, but GPT5.4 is a more powerful, careful, and competent model for coding, and Codex is a far less buggy and more performant TUI. For the last 3 months I've gone back and forth between the two, and I always run anything written with Claude Opus 4.6 by myself and my coworkers through Codex for review; it is constantly finding severe correctness issues, to the point where I simply won't subscribe to Anthropic's product anymore.
On top of that, OpenAI provides far higher token limits. Even their $20 plan goes quite far.
If I was just building crud websites, probably Claude Code would be fine, and it does indeed show more "initiative" and "imagination" but I've seen it build way too many race conditions and correctness issues to trust it or the work my coworkers make with it.
I just don't understand why it justifies so much spending!
Back to the mines. The Vulkan only writes itself when prompted with well-conditioned problem statements.
See, they kind of became a national asset, and letting it go down will leave the USA watching China take the lead for a very long time ahead. It just can't happen - right? So we'll all just fund it with taxes.
> checks list ...
Nope, nothing will either directly or indirectly affect me. Let it happen sooner rather than later, and unleash the mobs at the tech bros who set the world on course to make everybody's life more miserable. We'll still be here to pick up the scrapped RAM and GPUs to train and run inference on local models, thank you very much.
People continue to work; some proportion of those working use LLMs regularly.
Enough time has passed that subjective statements about the future don't pass muster. Look at the numbers - there have been no large-scale layoffs once you correct for over-hiring. Has hiring slowed down? Sure. However, I'd wager most firms are finding it pretty difficult to think of projects to take on that will generate positive NPV. If that's the case, why would they hire? Moreover, the focus has returned to cash flows - not product-based growth metrics. Which again reinforces the point about project selection.
Efficiency-generated growth does not continue forever - it's short-lived.
Let me remind you that you are not paying the full price for the service, and all the value of those companies is out of thin air. More or less the premise of the article. *When* you are asked the real price, we'll see whether the company prefers a human or a bot it can't pass blame to.
AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.
TurboQuant could be a key step in this direction.
And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.
Ding, ding, ding --- we have a winner.
https://techstartups.com/2026/03/26/nvidia-backed-ai-startup...
Cynicism makes you sound smart. Optimism makes you successful.
The cynicism around this technology is everywhere, even though it clearly has real power to solve problems. It is a technology that enables so many use cases that were impossible before, which makes it very highly hyped. And that is causing an immune (over)reaction from natural skeptics, which is an error.
People need to take a measured, reality based, view of how the technology is being used today, the adoption curve, and the increase in capabilities over time.
It's clearly being used strongly, and may even be revolutionary.
Bubbles burst when there's no 'there' there. AI has an undeniable 'there'—the only question is the timing of the ROI.
At this rate, I’d almost prefer to talk on a private mailing list with vetted resumes.
"No longer?" It never was.
Especially with AI boosters being allowed to degrade the comments section, shilling their paid blogs and violating the HN guidelines.
A bubble doesn't necessarily mean that the underlying tech/innovation isn't useful. It's a financial and economic phenomenon that is pretty well understood and researched:
- During the hype cycle, investors tend to overestimate the short to mid term effects and underestimate the long term effects.
- It's near impossible to pick the winners in advance, and research has shown that investors underestimate how many losers there will be.
- The financial system/market works very well when there are localized issues with debt. Those get seemingly automatically detected and repaired. But broad increases in credit not so much. Those spread into the whole system in non-obvious and complex ways and destabilize the whole system, which can lead to very large corrections.
etc.
Personally I'd say it's a problem that prices of consumer goods go up that far to satisfy this part of the market. We could use a more sensible way to advance the technology.
In my opinion this is incomparable to what we are seeing with agentic AI, which is rapidly replacing hand-written code.
I figure chances are AI is not going to stop here.
- AI is a genuinely transformative technology on par with the internet and on track to probably surpass the smartphone
- The inflated valuations, the circular flows of money (or "money"), and the financial cup-shell game mean that the players of the game are all a few bad weeks away from catastrophe. This is, of course, nothing new for SV -- but the scale this time is new. Some believe it will soon collapse -- "bubble," thus.
The question is when the frontier AI companies will turn a profit on said transformative technology, since, other than for NVIDIA and big tech, it is losing them tens of billions, and who will survive a crash when it comes.
You know you are in a bubble when people with a clear financial incentive go on newsletters and podcasts and post extremely outlandish predictions to sell the public on something.
The number of engineers becoming snake-oil salesmen and vibe-coders becoming overnight cybersecurity experts selling AI courses is a good indicator, and one I am keeping an eye on.
This is good. It's how you know they lacked the intellectual rigour required to be engineers in the first place and thus never were.
Things look different from within a bubble, you need an outside perspective.