We keep seeing estimates like this repeated by AI companies and such. There is something that really irks me about it, though: it assumes that companies replacing labor with LLMs are willing to pay as much as (or at least a significant fraction of) the labor costs they are replacing.
In practice, I haven't seen that to be true anywhere. If Claude Code (for example) can replace 30% of a developer's job, you would expect companies to be willing to pay tens of thousands of dollars per seat for it. Anecdotally at $WORK, we get nickel and dimed on dev tools (somewhat less so for AI tools). I don't expect corporate to suddenly agree to pay Anthropic $50k per developer even if they can lay off 1/3 of us. Will anyone actually pay enough to realize that "trillion dollar" capture?
LLMs are, ultimately, software. And we’ve had plenty of advances in software. None of them are priced at the market value of the labor they save. That’s just not how the economics work.
The question is whether developers who complain today that $200/month is too much will stop using it or start paying $2,000/month.
These numbers actually look really good. OpenAI's revenue has been increasing 10x year-on-year since 2023, which means spending even 3x the resources to produce another model next year would likely generate a healthy profit. The newer models are also more efficient, so inference costs tend to decrease as well.
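A quick back-of-the-envelope version of that argument. The starting figures below are hypothetical placeholders, not reported numbers; only the 10x revenue growth and 3x spend growth come from the comment above.

    # Back-of-the-envelope: revenue growing 10x/year vs. training spend growing 3x/year.
    # Starting values are hypothetical placeholders, not reported figures.
    revenue = 4.0        # $B, assumed annual revenue in year 0
    train_cost = 3.0     # $B, assumed cost of producing the current model

    for year in range(1, 4):
        revenue *= 10    # the 10x year-on-year revenue growth claimed above
        train_cost *= 3  # spending "even 3x the resources" on the next model
        ratio = revenue / train_cost
        print(f"year {year}: revenue ~${revenue:,.0f}B, training ~${train_cost:,.0f}B, ratio {ratio:.1f}x")

Under those assumptions the revenue-to-training-cost ratio improves by roughly 3.3x every year, which is the sense in which the numbers "look really good".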
As long as models can keep improving, and crucially if that depends on the amount of compute you put into them, OpenAI and other closed-source AI companies will succeed. If those two key assumptions stop being true, I can definitely see the whole house of cards crumbling as open-source competition eats their lunch.
Barring solid evidence otherwise, you would think that GPT 5.2 was built largely on top of GPT 5, enough that the majority of the cost of 5.2 was possibly incurred in developing GPT 5.
It would be like shipping v1.0 on day one, discovering a bug, and shipping v1.01 the next day. Then at the end of the year you report that v1.0 massively lost money, but you wouldn't believe the profit we made on v1.01; it was the single largest return on a single day of development we've ever seen.
Given that AI lab burn rates are so high that AI capex shows up in nationwide economic stats, this clearly cannot keep burning for that long.
So what happens first: labs figure out how to get compute costs down by an order of magnitude, they add enough value to raise prices by an order of magnitude (Uber), or some labs begin imploding?
Keep in mind that another aspect of the comparison is that Uber didn't trigger an entire supply-chain spending effect. That is, you didn't have new car companies building new factories to produce 10x more cars, new roads being built, new gas stations being built, etc., the way you do for AI.
It's like the entire economy has taken 1 giant correlated bet here.
I think the answer lies in the "we actually care a lot about that 1% (which is actually a lot more than 1%)".
If we consider what typically happens with other technologies, we would expect open models to match others on general intelligence benchmarks in time. Sort of like how every brand of battery-powered drill you find at the store is very similar, despite being head and shoulders better than the best drill from 25 years ago.
Yes, as long as that gap stays consistent, there is no problem with building on ~9 months old tech from a business perspective. Heck, many companies are lagging behind tech advancements by decades and are doing fine.
They all get made in China, mostly all in the same facilities. Designs tend to converge under such conditions. Especially since design is not open loop - you talk to the supplier that will make your drill and the supplier might communicate how they already make drills for others.
Going purely by Artificial Analysis, Kimi K2.5 is rather competitive in terms of pure output quality; its agentic evals are also close to or beating US-made frontier models; and, lest we forget, the model is far more affordable than said competitors, to the point where it is frankly silly that we are even comparing them.
For what it's worth, of the models I have been able to test so far, focusing purely on raw performance (meaning solely task adherence, output quality, and agentic capabilities, so discounting price, speed, and hosting flexibility), I have personally found the prior Kimi K2 Thinking model to be overall more usable and reliable than Gemini 3 Pro and Flash. Purely on output quality in very specific coding tasks, however, Opus 4.5 was in my testing leaps and bounds ahead of both the Gemini models and K2 Thinking, though its task adherence was surprisingly less reliable than Haiku 4.5 or K2 Thinking.
Given that Opus 4.5 is many times more expensive and in some cases less reliable at adhering to tasks, I really cannot say that it is superior or that Kimi K2 Thinking is inferior here. The latter is certainly better in my specific usage than any Gemini model, and again, I haven't yet gone through this with K2.5. I try not to presume from the outset that K2.5 is better than K2 Thinking, though even if K2.5 merely stays at the same level of quality and reliability, just with multimodal input, that would make the model very competitive.
It is highly dependent on what the best represents.
If you had a 100% chance of not breaking your arm on any given day, what kind of value would you place on that over a 99% chance on any given day? I would imagine it to be pretty high.
The top models are not perfect, so they don't really represent 100% of anything on any scale.
If the best you could do is a 99% chance of not breaking your arm on any given day, then perhaps you might be more stoic about something that is 99% of 99%, which is close enough to 98% that you are 'only' going to double the number of broken arms you get in a year.
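A quick sanity check of that arithmetic, using the per-day probabilities from the analogy above:

    # Expected broken arms per year for the "99% vs 99% of 99%" analogy.
    DAYS = 365

    p_best = 0.99            # best available: 99% chance of no broken arm per day
    p_cheaper = 0.99 * 0.99  # "99% of 99%", roughly 98%

    for label, p_ok in [("best (99%)", p_best), ("99% of 99%", p_cheaper)]:
        expected_breaks = (1 - p_ok) * DAYS
        print(f"{label}: ~{expected_breaks:.1f} broken arms per year")

That works out to roughly 3.7 vs. 7.3 broken arms a year, so dropping from 99% to ~98% really does about double them, which is why that "missing 1%" matters.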
I suspect using AI in production will be calculated more as a likelihood of pain than as increased widgets per hour. Recovery from a disaster can easily eat any productivity gains.
But if it's just 33% as good, I wouldn't bother.
Top LLMs have passed a usability threshold in the past few months. I haven't had the feeling the open models (from any country) have passed it as well.
When they do, we'll have a realistic option of using the best and the most expensive vs the good and cheap. That will be great.
Maybe in 2026.
Their cost is not real.
Plus you have things like MCP or agents that are mostly being spearheaded by companies like Anthropic. So if it is "the future" and you believe in it, then you should pay a premium to spearhead it.
You want to bet on the first Boeing, not the cheapest copy of a Wright brothers plane.
(Full disclosure: I don't think it's the future, and I think we are over-leveraging on AI to a degree that is, no pun intended, misanthropic.)
How do you even do that? You can train on glorified chat logs from an expensive model, but that's hardly the same thing. "Model extraction" is ludicrously inefficient.
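For concreteness, here is a toy sketch of what "training on glorified chat logs" amounts to: fitting a small student model to recorded input/output pairs from a stand-in teacher. Everything here (the teacher function, the student architecture, the data volume) is an illustrative assumption, and the point it makes is the one above: the student only learns behaviour that actually shows up in the logs, which is why extracting a frontier model this way is so inefficient.

    # Toy sketch of "training on an expensive model's chat logs":
    # a small student network is fit to (prompt, teacher_output) pairs.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def teacher(x: torch.Tensor) -> torch.Tensor:
        # Stand-in for the expensive model: some fixed, unknown mapping.
        return torch.sin(3 * x) + 0.5 * x

    # "Chat logs": sampled inputs plus the teacher's recorded outputs.
    prompts = torch.rand(10_000, 1) * 4 - 2
    responses = teacher(prompts)

    student = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)

    for step in range(2_000):
        idx = torch.randint(0, prompts.shape[0], (256,))
        loss = nn.functional.mse_loss(student(prompts[idx]), responses[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Anything the teacher can do that never appears in the logs is never learned.
    print(f"final imitation loss: {loss.item():.4f}")

For a real LLM the "teacher" outputs are API responses and the student is a full language model, so the amount of logged data (and money spent generating it) needed to approximate the teacher is enormous.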
So what?
The long view is to see the microcontroller as a commodity piece of hardware that is rapidly changing. Now is not the time to go all in on Betamax and take out 10-year leases on physical Blockbuster stores when streaming is 2 weeks away.
AI is possibly the most open technological advance I have experienced; there is no excuse, this time, for skilled operators to be stuck for decades with AWS or some other proprietary blend of vendor lock-in.
So really, the argument pretty well makes itself in favour of the $0.50 microcontroller.
[1]: https://finance.yahoo.com/news/nvidia-accused-trying-cut-dea...
[2]: https://arstechnica.com/tech-policy/2025/12/openai-desperate...
[3]: https://www.businessinsider.com/anthropic-cut-pirated-millio...
There are pretty good indications that the American LLMs have been trained on top of stolen data.
This works with every novel I've tried so far in Gemini 3.
My actual prompt was a bit more convoluted than this (involving translation) so you may need to experiment a bit.
They can't even officially account for any Nvidia GPUs they managed to buy outside the official channels.
https://epoch.ai/gradient-updates/can-ai-companies-become-pr...
TL;DR: to achieve the same benchmark scores, prices have gone down by anywhere from 9x to 400x.
This should clearly tell you that the margins are high. It would be absurd for OpenAI to be constantly running at a loss when prices have gone down by ~50x on average. Instead of being ~50x cheaper, couldn't OpenAI be, say, 45x cheaper and be in profit? What's the difference?
I genuinely don't know why you need any more proof than just this statistic?
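A minimal version of that arithmetic, assuming (purely for illustration) that serving cost per token fell by the same ~50x as the price did:

    # If serving costs fell ~50x and prices fell ~50x, per-token margin is unchanged.
    # Dropping the price by "only" 45x instead leaves a gross margin on inference.
    # The 50x cost reduction is an assumption for illustration, not a reported figure.
    old_price = 1.0             # normalized price per token, before
    new_cost = old_price / 50   # assume cost fell in line with the ~50x price drop

    for price_drop in (50, 45):
        new_price = old_price / price_drop
        margin = 1 - new_cost / new_price
        print(f"price cut {price_drop}x: gross margin on inference ~{margin:.0%}")

Under that assumption a 50x price cut leaves ~0% margin while a 45x cut leaves ~10%, which is the point being made here, though note this only covers inference and says nothing about training or R&D spend.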
If you compare with phones or PCs, there was a time when each new version was a huge upgrade over the last, but eventually these AI models are going to mature and something else is going to happen.
Frankly, it's real slop.
I think you are right that trust and operational certainty justifies significant premiums. It would be great if trust and operational certainty were available.
That's why OpenAI tries to push the Assistants API, Agents SDK, and ChatGPT Apps, which are more of a lock-in: https://senkorasic.com/articles/openai-product-strategy-2025
Funny thing is, even OpenAI seems to ignore the Assistants/Apps APIs internally: Codex (CLI) uses the Responses API.