This means we're going to need $1t+ per year in spending on tokens. 200m knowledge workers in the world, 30m developers. We're talking about a world where you need 5% of every knowledge worker's salary to go into tokens. 20% if you're a developer.
That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.
We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.
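A rough sanity check of those figures, as a sketch only. Two things here are assumptions and not from the comment: that the 30m developers are a subset of the 200m knowledge workers, and that everyone earns the same average compensation. The second function treats an X% velocity gain as worth X% of comp, which is also a simplification.

```python
# Back-of-envelope check of the $1T/year token-spend figures quoted above.
# Assumptions (not from the thread): developers are a subset of the 200M
# knowledge workers, and all of them earn the same average compensation.

TARGET_SPEND = 1e12        # $1T per year on tokens
KNOWLEDGE_WORKERS = 200e6
DEVELOPERS = 30e6
NON_DEV_SHARE = 0.05       # 5% of salary for non-developer knowledge workers
DEV_SHARE = 0.20           # 20% of salary for developers


def implied_average_comp(target=TARGET_SPEND):
    """Average comp at which the quoted salary shares add up to the target."""
    non_devs = KNOWLEDGE_WORKERS - DEVELOPERS
    weighted_headcount = non_devs * NON_DEV_SHARE + DEVELOPERS * DEV_SHARE
    return target / weighted_headcount


def net_gain_share(velocity_gain, token_share):
    """Net benefit as a fraction of comp, valuing X% velocity at X% of comp."""
    return velocity_gain - token_share


if __name__ == "__main__":
    print(f"implied average comp: ${implied_average_comp():,.0f}")
    for gain in (0.20, 0.40):
        net = net_gain_share(gain, DEV_SHARE)
        print(f"velocity +{gain:.0%} at 20% spend -> net {net:+.0%} of comp")
```

Under those assumptions the shares only reach $1T if average comp is around $69k, and a +20% velocity gain at 20% spend nets out to zero, which is the point the comment is making.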
- The publicly available information about how inference costs compare to training costs is conflicted. EEs involved in datacenters talk about power usage spikes during training runs as if they were a major factor in the designs, but academic papers discussing cost-optimal scaling confidently treat inference-time compute as a major factor.
- On the side of the balance indicating that training is more compute-intensive after amortization than inference is that Chinese providers, constrained primarily by access to compute, have nearly unlimited token availability at a lower price than US providers (inference), but poorer model capabilities (training). That would make sense only if US providers are inflating inference costs by 20-30x due to amortized training costs that overseas providers were not able to take on (there are other factors too).
- If training >> inference, they're in a prisoner's dilemma that far exceeds the ordinary zero-marginals model of competition between firms (due to its huge discrete stepwise nature). On the other hand, if inference>>training, the high-level analysis popularized by certain thought leaders, that it's like a utility, would be true. You'd tend to count this as a vote for inference>>training, but the CEOs saying it at least have a huge incentive to agree because the alternative, the prisoner's dilemma, would stop investment very fast.
- The only voice in the story I just told you that had anything to do with fact (as opposed to high-level analysis and ivory tower armchair management of a secretive business) was the rumors from facilities engineers. That shows you the state of our understanding...
- If we don't even know the ratio between amortized capital expenses and operational costs, outside investor analysis is impossible. It doesn't matter how finely they divide the accounting buckets for office ferns and indoor ferns if the single biggest part of their business is obscured for trade secret reasons.
Yes I know there's no evidence and this is lazy reasoning. But there's probably a bit of truth to this line of thought.
Also, inference costs are bound to go way down with more optimized architectures. GPUs are fundamentally not great at inference. No platform where the weights are streamed from a large pool of memory is. If the models ever quiet down, there will be massive step changes in cost/token, energy/token and tokens/second, as models are etched into silicon ala https://chatjimmy.ai/
You have to keep in mind that about 99% of their announcements are targeted towards investors (their most important revenue source..), so they're not going to be afraid to mention metrics that make the business look better.
The short and only kind of wrong version is:
In the US, companies are not allowed to unfairly privilege some investors over others by giving them access to secret information that would let them judge the future prospects of the company. (Except in all the ways they can, but these usually involve some kinds of insider trading rules.) Private companies can handle giving out secrets to investors by literally writing a memo and mailing it to all their investors, if they want to give out some secrets to one of them.
Public companies cannot do that, even if they knew who all their investors were, but must instead consider every member of the public a potential investor, even if they don't already own the stock. Because of this, when public companies want to reveal material information about their future prospects, they must reveal it to everyone.
The 1 they were missing is that AI requires both training and inference, and training is by far the expensive part. And that in principle you can stop training at any point and keep using the models as they are. (But that means that if other companies keep improving their models, you'll be left behind...)
In contrast, inference is fairly cheap and all the providers have great margins on it. Eventually either investment in training stops having commensurate impact on model quality, and people stop doing that and instead concentrate on making inference faster and even more efficient. Or if that doesn't happen, things will get very weird very quickly.
Show us your work, then. If it's so easy to do, this should be a trivial request to accommodate, no?
Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet. Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.
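The multiple follows directly from the quoted prices. A minimal sketch, using only the per-million-token prices given in the comment; the 10:1 input-to-output token mix in the blended figure is a hypothetical workload, not something the comment states.

```python
# Prices in USD per million tokens, as quoted in the comment above.
SONNET = {"input": 5.00, "output": 15.00}
KIMI_DEEPINFRA = {"input": 0.75, "output": 3.50}


def price_multiples(premium, baseline):
    """Per-direction price ratio of the premium provider over the baseline."""
    return {kind: premium[kind] / baseline[kind] for kind in premium}


def blended_multiple(premium, baseline, input_tokens, output_tokens):
    """Price ratio for a workload with the given input/output token mix."""
    def cost(prices):
        return prices["input"] * input_tokens + prices["output"] * output_tokens
    return cost(premium) / cost(baseline)


if __name__ == "__main__":
    for kind, ratio in price_multiples(SONNET, KIMI_DEEPINFRA).items():
        print(f"{kind}: {ratio:.2f}x")
    # Hypothetical mix: 10 input tokens per output token.
    print(f"blended (10:1): {blended_multiple(SONNET, KIMI_DEEPINFRA, 10, 1):.2f}x")
```

That gives roughly 6.7x on input and 4.3x on output, which is where the 4-7x range comes from.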
Speaking to your point, inference being dramatically less costly than training would not be seen as a delta from the norm. The model of providing inference for anything near the operational costs (like a utility would) would be the delta from the norm, if it were true.
Training is also done over batches, which increase memory requirements by several orders of magnitude. This is why training needs costly compute.
One of the ways out of this unfortunate situation is to use something like Stochastic Average Gradient Descent [1]. Examples there are mostly concerned with regularized logistic regression, which makes the problem more or less convex. Neural networks are inherently non-convex. Still, maybe some ideas from there can be utilized in the context of neural networks, like use of an estimated Lipschitz constant to derive curvature and an appropriate learning step.
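For the convex case mentioned above, the idea can be sketched in a few lines. This is an illustrative pure-Python version of stochastic average gradient for L2-regularized logistic regression, not code from the linked slides. It assumes labels in {-1, +1}, uses a closed-form Lipschitz bound (0.25 * max ||x_i||^2 + lam) where the comment talks about an estimated one, and takes the commonly used practical step size 1/L.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(w, X, y, lam):
    """(1/n) * sum_i log(1 + exp(-y_i * w.x_i)) + (lam/2) * ||w||^2"""
    total = 0.0
    for xi, yi in zip(X, y):
        m = yi * sum(wj * xj for wj, xj in zip(w, xi))
        total += max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
    return total / len(X) + 0.5 * lam * sum(wj * wj for wj in w)


def sag_logistic(X, y, lam=0.1, epochs=50, seed=0):
    """Stochastic average gradient with a Lipschitz-derived step size."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Lipschitz bound on each per-sample gradient (assumed closed form).
    lipschitz = 0.25 * max(sum(v * v for v in x) for x in X) + lam
    step = 1.0 / lipschitz
    w = [0.0] * d
    # The logistic-loss gradient of sample i is a scalar times x_i,
    # so only that scalar needs to be remembered per sample.
    memory = [0.0] * n
    grad_sum = [0.0] * d  # running sum over i of memory[i] * x_i
    for _ in range(epochs * n):
        i = rng.randrange(n)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        scalar = -y[i] * sigmoid(-margin)
        delta = scalar - memory[i]
        memory[i] = scalar
        for j in range(d):
            # Swap the old stored gradient of sample i for the fresh one,
            # then step along the average (regularizer handled exactly).
            grad_sum[j] += delta * X[i][j]
            w[j] -= step * (grad_sum[j] / n + lam * w[j])
    return w


if __name__ == "__main__":
    X = [[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]]
    y = [1, 1, -1, -1]
    w = sag_logistic(X, y)
    print("weights:", w, "loss:", loss(w, X, y, 0.1))
```

The memory cost is one scalar per training sample here, which is what makes the method attractive for linear models. For a neural network the stored per-sample gradient would be a full parameter-sized vector, which is part of why carrying the idea over is not straightforward.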
[1] https://www.cs.ubc.ca/~schmidtm/Courses/540-W19/L12.pdf
Training is inference + backwards pass (~2x inference cost) + activations (vram overhead) + optimizer (vram overhead) + gradients (vram overhead).
And last, but not least, you need only one hidden layer kept in RAM for inference, but you need all of them (61 for Deepseek models) kept in RAM for computing gradient for one sample.
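To put rough numbers on that breakdown, here is a sketch of the per-parameter and per-layer memory accounting. The byte counts assume bf16 weights and gradients plus fp32 master weights and Adam moments, a common mixed-precision setup that the comments do not specify. The hidden size and token count in the usage lines are hypothetical placeholders.

```python
BYTES_BF16 = 2
BYTES_FP32 = 4


def inference_weight_bytes(params):
    """Weights only, stored in bf16 (assumed serving precision)."""
    return params * BYTES_BF16


def training_state_bytes(params):
    """Weights + gradients + optimizer state under assumed mixed-precision Adam."""
    weights = params * BYTES_BF16
    grads = params * BYTES_BF16
    master_weights = params * BYTES_FP32
    adam_first_moment = params * BYTES_FP32
    adam_second_moment = params * BYTES_FP32
    return weights + grads + master_weights + adam_first_moment + adam_second_moment


def activation_bytes(layers_kept, tokens, hidden):
    """One bf16 hidden-state tensor per kept layer (ignores attention internals)."""
    return layers_kept * tokens * hidden * BYTES_BF16


if __name__ == "__main__":
    params = 10**9
    ratio = training_state_bytes(params) / inference_weight_bytes(params)
    print(f"training state vs inference weights: {ratio:.0f}x per parameter")

    # Hypothetical shapes: 4096 tokens, hidden size 7168; 61 layers as in the comment.
    one_layer = activation_bytes(1, 4096, 7168)
    all_layers = activation_bytes(61, 4096, 7168)
    print(f"activations kept, training vs inference: {all_layers / one_layer:.0f}x")
```

Under these assumptions the parameter-side overhead is a fixed 8x, and the activation overhead scales with layer count and with however many tokens are in flight per batch.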
Batch size is frequently limited by compute bottlenecks well before memory.
We are still chasing the best because the best is moving rapidly, but it’s a simple thought experiment to work out what the cost to serve an 8B model from 2 years ago is in a world of 2T models.
Note: parameter counts are illustrative. Concretely, qwen3.6 27B delivers opus 4.5 capability at 1/27th the cost on openrouter. Single chip llama3 8b performance can exceed 17k tokens/sec.
* At some point model capability reaches diminishing returns. Then inference >> training in the future but training >> inference now. It’s not a prisoner’s dilemma but a land grab to solidify market position and be one of the 2-3 firms left standing as dominant in the space. The model companies aren’t super sticky yet but they’re working on it.
* even if training remains >> inference, it’s possible to have multiple price points like they do today. If you need the most capable model you’ll be paying exponentially more per token to supplement the training cost even though the serving cost is marginal because most people will be satisfied with cheaper / less capable models for most tasks.
I buy that inference is a dropping line item while training is a growing one. There’s all sorts of things on the horizon that’ll be orders of magnitude improvements, from startups burning models into ASICs to get orders of magnitude more performance to alternate architectures like diffusion transformers that have orders of magnitude structural optimizations. It’s inevitable that it’ll come down even further from where we are. It’s possible model training also will go down but I’ve not seen any compelling research suggesting major “easy” reductions here.
So one possible future is that frontier-level training becomes so expensive and the use cases so sparse that it simply isn’t viable to keep going bigger.
Maybe investors will realise that "the only winning move is not to play".
And so we are left with (as was) frontier models getting more and more out of date as whoever their post bankruptcy custodians are try to eke pennies on the dollar for inference on their decaying property. Perhaps along with local and/or highly specialized models still feeding on the after-glow of the huge amount of training that was (and is no longer) done.
The next AI winter is going to be deep, savage, and long.
Why are they getting out of date? Is it because we have new content from the internet that the older models did not have? Or are we simply trying to increase the size of the training data? In other words not more up-to-date in terms of time the content was created vs. wanting to use bigger training-input-sets?
In general, I don't think you can reason from the existence of potentially stranded investments back to revenue projections.
And when you frame this as percentage of salaries that's a sneaky implication that this is only about reducing salaries and headcount, and not about adding capability or competing more effectively.
That said, 5% of knowledge worker comp actually seems very low to me, given the capabilities.
Even 20% of developer comp seems like an incredible bargain. From what I'm seeing in my own work, the value is probably closer to 100% of developer comp so if the cost is only 20% that's way underpriced.
Our estimated spend for AIaaS would exceed that cost in less than a year.
In a few years, there will be hardware capable of running frontier models good enough for most things at accessible prices for even tiny companies.
If open source models are ~3-6 months behind SOTA, and ~opus4.6 capabilities are good-enough for product market fit, do the frontier labs have half a decade to catch up on their prior burn?
AI cost ballooning faster than companies can afford is becoming a very common topic in my circles right now. The era of "I'll pay infinitely more for marginal gains" is over from what I can tell.
That's doing a lot of work here.
The future I see isn't most companies buying hundreds of thousands in hardware to run models, it's them adding a line item to their AWS bill. Inference costs on the larger hosted open source models are dramatically lower than the frontier labs API pricing.
The days of requiring a data center to run anything resembling opus 4.6 are already numbered. (But the industry will fight hard to get people to keep paying the Claude tax.)
And yeah, that may be the ~decade world, but we're in the mainframe era of the frontier models. It's going to be more economical for basically any consumer, and most businesses, to pay someone else to host models for quite a while.
Said model will also run as a tool-calling coding model excellently (it's no Opus, but for a thing that once set up is just the cost of energy, it's incredible). It can type faster than you can, probably 10x faster, so with guidance it'll make you faster. And it's free.
It's here. If folks want ChatGPT without a subscription, they can have it today on their computer. The only money to be made is in the high end models doing "serious business" work spanning 1M+ token contexts and massive uncertainty. Everything else is already set to be eaten by today's local models.
Here's a prompt I just ran against Claude Opus 4.7:
> Use python3 to experiment with whether the SQLite3 authorizer mechanism can be used to detect an INSERT OR REPLACE based just on running an explain query without examining the SQL string itself
Opus nailed it: https://claude.ai/share/c4212606-3fee-4b7c-bc97-505e0348ccac
I tried the same thing against qwen/qwen3.5-35b-a3b running locally in lmstudio, with the Pi coding agent. At first it looked like it was going to do great! And then it fell apart over the course of several tool calls: https://gisthost.github.io/?8ae2f842df619fb7fd8f1ccd82fe41c7
I'm used to GPT-5.5 and Opus 4.7 handling that kind of prompt without any problems at all.
I bet this will ironically be couched in "safety" reasons or regulation to get anti-AI folks on board, even if it favors the large incumbents.
That's the future Amazon sees too. We just had a week long session with the AWS team and they pushed that to us multiple times.
Claude code was a lot of people's introduction to using coding agents that could do a lot more than copy-pasting from a chatbot or autocomplete.
Opus 4.6 quality for local inference would be revolutionary.
The goalpost we've been bludgeoned with over and over again is that, in particular, Everything Changed in November 2025. That GPT 5.2 and Claude 4.5 were the inflection point. That is actually 6 months ago. And DeepSeek 4 is already there.
> run locally
You can't run DeepSeek locally on consumer hardware[1], but you can on enterprise hardware, and enterprise spend is the subject of this conversation -- and even if you aren't self-hosting, it doesn't matter, because you can just get your inference from one of the many companies serving DeepSeek, who trivially undercut the pricing of OpenAI/Anthropic because they didn't have to spend hundreds of billions on training frontier from scratch but instead only invest in supporting inference, which is already profitable.
[1] Since this misconception comes up all the time, I'll go ahead and pre-empt it: no, training a 32b parameter model on outputs from DeepSeek and running that locally is not "running DeepSeek", despite the hundreds of stupid articles and Youtube videos making that idiotic claim that they're running it on a 5090.
Maybe not DeepSeek v4 Pro, but I've run DeepSeek v4 Flash on my 128GB MacBook Pro using antirez's carefully quantized https://github.com/antirez/ds4 and it's impressive.
And a 5% worse model for 10% of the price of the bleeding edge will be worth it for the majority of people.
The larger point I'm making is I think models are rapidly becoming commoditized. There is probably a small market long term that's willing to pay 10x for 10% marginal gains, but the majority of the buyers in the market will be economic and we're likely to have a lot of folks willing to spend 1/10 the cost for 90% of the performance, and plenty of companies that haven't raised hundreds of billions-trillions who can provide that.
A lot of the frontier labs' valuations have been based on an assumption that 1-2 companies would get break-away intelligence that basically made them economic chokepoints indefinitely into the future. The reality that's becoming increasingly clear is that model quality is a pretty linear function of (cash burned - ability to copy others' homework) and the economics are starting to look a lot more like airlines than online advertising.
The economics of airlines are such that they generally earn a return on capital less than cost of capital.
I think this is exactly where we are heading and OAI-Anthropic are the concordes.
Your argument rests on the "for marginal gains" part but it's really not clear that the gains are marginal in the foreseeable future.
We're 3.5 years into this current AI wave, and a lot of the valuations have been predicated on what you're arguing here -- that essentially should one of the labs make an order-of-magnitude improvement or hit escape velocity on recursive self-improvement they'd become the most powerful economic chokepoint in history.
The reality has been that given access to compute + capital all of the labs can stay pretty competitive with each other. Someone does a bit better on coding, someone else does a bit better on tool calling, and then they swap after each spending another $100bn.
The market looks like a commodity market where the commodity is intelligence, not a winner-take-all market with massive margins. Plenty of people get rich in oil and airlines, but they notably don't tend to be the innovators long term, they tend to be the operators. Obviously if the machines become sentient tomorrow, turn on their masters, and hit world-dominating intelligence, that assessment changes, but after several years of that narrative while objective reality looks quite different I think the more sober voices are starting to gain a foothold.
I remember that even when GPT-4 was king, the Gorilla paper showed that Llama 7B could be fine-tuned to outperform GPT-4 on tool calling.
On domains that don’t involve agentic tool calling*, I haven’t found the frontier to have advanced that much.
Edit: I should broaden this to domains that naturally lend themselves to RLVR training. Models are drastically better at math now.
Just think how much further that $100K would have gone if the hardware market wasn't so screwed-up.
Anecdote: I priced-out adding 1TB of RAM to a four node cluster a couple months ago. The cluster was purchased in fall of 2024 w/ 4 nodes, each with 256GB RAM. The nodes cost just over $14K apiece back in 2024 (entire box, not just the RAM).
Dell wanted >$90K a couple months ago to add 256GB to each node.
RAM is expensive, but not THAT expensive. I just bought 128GB for about $5k for our build cluster (it's not even for AI, sigh). Even if you need larger-sized DIMM sticks, it's still going to be in the vicinity of ~15k tops.
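Putting the two quotes on a per-GB basis makes them easier to compare. This is a sketch using only the figures in the two comments, assuming the 128 figure is gigabytes and treating the ">$90K" quote as exactly $90K, so the Dell rate is a lower bound.

```python
# Figures as quoted in the two comments above.
DELL_QUOTE_USD = 90_000        # ">$90K", treated as exactly $90K
DELL_QUOTE_GB = 4 * 256        # 256GB added to each of four nodes
BUILD_CLUSTER_USD = 5_000      # "about $5k"
BUILD_CLUSTER_GB = 128         # assumed to mean gigabytes


def usd_per_gb(total_usd, total_gb):
    return total_usd / total_gb


dell_rate = usd_per_gb(DELL_QUOTE_USD, DELL_QUOTE_GB)
build_rate = usd_per_gb(BUILD_CLUSTER_USD, BUILD_CLUSTER_GB)

if __name__ == "__main__":
    print(f"Dell quote:     ${dell_rate:.2f}/GB")
    print(f"Build cluster:  ${build_rate:.2f}/GB")
    print(f"1TB at the build-cluster rate: ${build_rate * DELL_QUOTE_GB:,.0f}")
```

On these numbers the Dell quote is a bit over twice the per-GB rate of the build-cluster purchase.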
I haven't had problems w/ Dell support and 3rd party memory, personally, but given the machines' application I understood the concern.
The Gemini Flash is very good at searches. Just about any low end model can toss out a poem. All the higher end models (open source and otherwise) seem to be able to churn out code that passes tests. The smaller, "less capable" ones are much faster at it, which means in the hands of a skilled practitioner are the best choice for that task. But they rapidly fall apart where there isn't a hard source of truth (like a good test suite) to grind against. Because of that you have to use a bigger model for bug finding. In that task the open source models tend to fail on larger code bases, where something like Opus still shines. I gather Mythos is an absolute monster, and unparalleled, and unavailable. I'm sure one of the reasons for that is it's so expensive to run.
Or to put it another way - you don't use a 100 tonne crane to pick up the shopping. And ... the smaller models will happily run on in-house hardware. You may not do it today because of the current DRAM price and integrated NPUs have just started shipping, but in 5 years time models will be running on your phone.
AFAIK you would get about ~5 concurrent users, with a max context window of ~128K tokens on the larger models.
This wouldn't be good enough for coding -- are you guys thinking of using it for something else?
The decadal move to all-cloud-all-the-time killed off in-house hardware teams while the C-suite chased their OpEx dreams.
It would be interesting if we come full circle on this.
What makes you so confident about this prediction? Hardware costs haven't exactly been cratering recently.
No, but local models have been booming in performance/quality improvements. The RAM shortage won't last forever (more supply will come online if demand doesn't diminish), and then the math would be pretty easy.
My single spark has me running Qwen 3.6 27B and antirez’s specially quantised DeepSeek v4 Flash (which is shockingly impressive)
What you call harnesses I call… bullshit?
I was going to say - the models are just going to keep growing at a pace exceeding the pace of hardware pricing/availability
But then I realised that, far more likely, there will be a plateau reached (again) where nobody is seeing gain, and at that point hardware will catch up
"As cable TV and Pay Per View came out, there were studies done about how many movies people would watch if given unlimited access to films. The results were bandied about as proof that we should build out all this infrastructure to support this line of business. When the data was further analyzed by statisticians etc, it turned out that people claimed they were going to watch films 10-12 hours a day, every day of the week. Impossible."
I feel like we are in a similar boat here where some people are assuming:
- EVERYONE is going to be using max tokens
- tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc
Surely we could just put better stuff on the radio, and accomplish most of the same goals for a far lower price?
None of them had the Pirates game.
I was thinking how the transistor radio was a far superior experience for this use case. Just tune to the channel broadcasting the game.
Then there’s NTS, BBC… You can listen to them from an online service, but at least in Europe there are amazing national FM broadcasting services.
TV is just bad radio with flickering lights.
Anthropic already hunts down OpenClaw users for using too much on their plan.
I'll give a different example: when LED lights started to become more popular, power usage didn't drop by the amount of power saved.
>- tokens will NEVER get cheaper due to improvements in hardware, software, design, market forces etc etc
Well, first, improvements in computing stalled or even rolled back purely because the price of everything compute-related shot up because of AI, and that will NOT be fixed for a while, ESPECIALLY if AI usage continues to increase.
Second, the price per token for a given model might go down over time, but better models have more expensive tokens, so we quickly get to a spot where:
* the price increase per token might not be worth the marginal improvement the next, better model brings
* more and more models are passing the "good enough for the task" threshold, so for fewer and fewer companies is there any economic sense in paying for the "best" instead of paying deepseek or some other company to run "previous gen" models
That's the game. There's a view you could take of this that this is just a growing of the pie: with those cost dynamics a lot more "small businesses" get a vast amount of leverage, so the overall economy grows without replacing the knowledge workers. I'm not sure I trust the MBA class to have that view.
I would argue that that's been the case for quite some time before AI. As an example, what innovative amazing world-changing products have Google or Meta launched in the past decade with their very high numbers of very talented and highly-compensated engineers? The issue with most big tech companies is leadership, strategy, and product direction. I'm not saying that they don't make any profits, just that they probably aren't "building [the right thing]".
AI for product development and management would be far more impactful than automating rote coding tasks / building React UIs that mirror API structures IMO.
Yeah, if this stuff actually worked that well already, OpenAI et al. would just run AI CEOs and engineers. Why get some other company to pay you at all when you can automate every other company out of existence and take all the money they make?
The fact of the matter is that while the tech has some uses, it sure as hell isn't a full scale replacement and you almost always actually have to massage the input into LLMs to get anything decent back out in practice. Some CEOs and managers can learn to do this, of course, and some already are... but that quickly turns into a second full time job. A "programmer" is still needed. The job might change from mostly hand-writing C++/JS/Python to prompt engineering + some manual coding to fix all the stupid fuck-ups that the bots can't solve themselves, but you still need someone to actually prompt the bot.
When that changes, it won't just be engineers losing work; there will be no reason to even have a human CEO any more.
I don't think there is any shortage of great ideas at these companies, they are just extremely bloated. And I don't think it's something like indecision or bad PMs, it's "we have a finite amount of time and resources so we need to be conservative but also not too conservative"
If you have AI systems that can simply build out POCs in days, backtest on real data, show reliable results and numbers, you get a suite of product options you were never able to get before. If you have coding agents that can speed up implementation, you can build more stuff and choose the things that stick.
It changes the cost/benefit calculus of the entire business. I think you are exactly right in that: PMs/leadership are by their nature orchestration machines. Other roles are as well, but I think PMs are at a particular advantage here: I expect it will be quite a while before core product decisions and creativity can be delegated to an AI, but not long until virtually everything that they're blocked on (legal approvals, POCs, wireframes, etc.) becomes less and less of a blocker.
I'll also add this: within a large organization, you often need to interact with many different codebases owned by many different teams. Agents have made it much easier to wrangle by having the ability to deploy one to scope out your web of dependencies to learn about what would be needed for feature X, and how that integration can happen.
We've been doing far more away team work simply because it makes things move faster. It's easier to convince a team to sign off/review something than it is to get them to commit to the planning and eventual work.
It genuinely is helping things move faster inside large organizations. Or at least, it is for us, particularly since we're getting organizational prioritization to actually build the scaffolding to make those agents more effective at search.
1000x yes: you have touched on what I think is the single biggest factor here, that is the humongous value of POCs. They are gnarly to build without agents, and so we used to have to get everyone on board so we didn't get screwed in performance reviews, which was a monumental task because that means convincing very busy PMs who have a lot on their plate and don't want to take risks on things they don't understand. Now it's like "can we scale this out" and you have a very nicely formatted proposal and POC. It de-risks things very quickly.
You still want someone whose ass is on the line if they get it wrong.
Say I want to build a feature in a product.
- DS has to do a deep dive (need buy in) to opportunity size and derisk with data. That DS has to work with other DS (people may have left or moved teams) to figure out how to get the right data and figure out what the difference is between 10 different tables that have overlapping but inconsistent data.
- Eng has to build up an actual simple demo (need buy in)
- Design has to make it not hideous (need buy in)
- Legal has to review what you're doing; POCs should involve real data where possible because otherwise no one will trust it, even if it's just for user analysis on existing products
This plus about 6 internal system bugs for custom tools that are flaky and whose team has long been re-orged or laid off, 8 people who won't answer you, 2 PTOs for the stakeholders, 6 weekly meetings
no one did POCs, they just had ideas and tried to get PM's to put it on the roadmap so if it fell through at least it was bought into
If they can crack that latter review/spec-check/assurance step, checking that what was built was what was demanded of the problem such that we don't have humans in the loop at that step either, then the bottleneck moves again. Then I think it moves to requirements capture and to product development, but that might depend on the industry.
Kubernetes was 11 years ago, and is huge enough to be included there. The Google Pixel was just under 10 years ago. So... not nothing haha
The problem is they get killed by some other executive who is afraid of their department looking bad by comparison.
I think this is fairly illustrative of the challenges in AI becoming as impactful as the Internet. The bottleneck is not making things. There are plenty of people who are really good at making things and can easily be 10x or 100x as productive as the average corporate worker. YCombinator was founded on that premise - small teams of founders and early employees could be orders of magnitudes more productive than the 1000s of corporate employees at their competitors.
The bottleneck is on bringing your product to market. If your innovative new product is built within a corporate environment, it'll get killed unless the executive you work under can get a promotion out of it, and you'll be denied all sorts of help with approvals, launch process, PR, marketing, branding, etc. If it's a startup, they'll try to shut you out with exclusive distribution deals, legal threats, lobbying efforts to change the legal environment, PR campaigns, FUD, etc.
The Internet was revolutionary because it let millions of people bring products to market without asking permission. Instead of having to bid for retail shelf space among dozens of entrenched competitors that all had sweetheart deals with the retailer, you could just put up a website and sell it to anyone across the globe. Instead of following hundreds of regulations that governed existing commerce, you could just launch something and sort it out later. AI doesn't really have that property - if anything, it makes things more centralized, with more gatekeepers, and so seems more likely to destroy economic value than add to it.
> YCombinator was founded on that premise - small teams of founders and early employees could be orders of magnitudes more productive than the 1000s of corporate employees at their competitors.
I think this is still true, but the theory is:
1. You don't need YC-type funding to do YC-type business any more;
2. You don't need to scale the business past those small teams any more, you just buy more tokens.
For clarity YC still obviously has a place as an incubator, mentoring, and networking function. I just think that what was previously the inevitable conclusion that you have to hire all the people the second you hit PMF to keep up with scaling the business as you scale sales is no longer inevitable. If you didn't want to go that way before AI, you were a "lifestyle business" and not worth investing in. As more and more knowledge functions get capably implemented by AI, it's the preferred position: humans are vastly more expensive than tokens, so you want them doing the stuff the AI still can't do.
I don't think this necessarily translates to mass unemployment. I think it translates to masses of smaller businesses that are radically more efficient because the handoffs between business functions are tool calls, not emails to someone who doesn't want to help.
> The Internet was revolutionary because it let millions of people bring products to market without asking permission.
Think about it this way: if I am a small business owner but I think it makes sense to do something that previously only a team in a corporate environment could do but is now within the reach of AI, not only can I do it now, but I also don't have to ask anyone for permission! Who wins between the corporation and the small business in that scenario?
> AI doesn't really have that property - if anything, it makes things more centralized, with more gatekeepers, and so seems more likely to destroy economic value than add to it.
I think this will turn out to be backwards. I can see a version of this where the number of things you can do without needing to turn to a gatekeeper for help increases to the extent that the balance completely inverts.
The vast majority of businesses are small, and AI can give them tools which previously required corporate scale to make sense, without the inefficient hand-offs between busy, political humans. Which is also something that the internet did! Getting an advert in front of a national market pre-internet was Hard but sometimes you had to do it because your target market was "all Canadians who buy toothpaste" or whatever and that meant saturation-bombing the physical environment with physical billboard ads, posters, flyers, and so on. So you only did it if you were P&G-scale. Now you, personally, can do it, trivially, for better or worse.
> Valve
Arguably a monopoly. They've got a product that sells itself with very low infra overheads for the income.
> Hedge funds
Very different model. I don't think the same intuitions apply.
I would agree but it's really minimized the building. More and more time is being spent on pre-coding work.
You'll find that most internal "innovation" teams are just lip service. In most cases, the "mothership" will be incapable of reproducing true innovation -- from a statistical perspective, culture perspective (mega corps are anti-scrappy; internal politics), and motivation perspective (startups aren't 9-to-5). It's much easier to have big M&A budgets, a VC arm, and some hand-wavy internal innovation group.
Every now and again, you'll get real innovations (Waymo, transistors, GUIs), but even those have a spotty track record of commercialization when created internally.
I suspect that AI will fail to pan out to the same extent for the same reason why outsourcing hasn't fully panned out (even though every company tries it after getting big enough).
The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.
I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.
I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.
The determinant of success was only whether the task needed American-tier labor or could make do with sub-American quality labor.
Basically every big tech company has large offices there and employs a lot of people.
The limitation is that Ireland is a relatively small country, and most Irish developers are already employed (which is why Ireland ends up being one of the main destinations for tech workers being hired from abroad).
That part of dev work, the requirements gathering, attention to details, clarifying requirements, is something AI also struggles with. A lot of companies basically waste time and money on outsourced devs because without a clear path forward they effectively will sit and do nothing, waiting for a prompt.
How I read your argument is that one distinguished engineer from the US could do the same with the use of AI.
I worked with both and I know great and bad engineers from both sides. The only thing is that the US has a bigger pool of great engineers.
My mental model for that is that outsourcing fails where the work is being done organisationally far from the knowledge needed to do it. We know that's true of teams inside organisations, there's been a lot of research on how distance in the organisational tree negatively impacts productivity. Outsourcing is a pathological worst-case of that.
The promise (promise! We're not there yet!) of AI is that I can have a cross-functional team on my laptop. Organisational distance is zero. Where previously the outsourced team has to wait for the time zones to roll round so I can answer their blocking question when I get to my email STRICTLY AFTER I have had my coffee, now it's a prompt in a chat window with a button I can click to make a choice in 5 seconds. Delay is gone, cost of delay is gone.
> The problems that will come up will be and always have been ongoing maintenance. AI is great at writing new code without a brain behind it, but once you get to the point where you need to refactor code, you start really needing someone with coding experience to guide the AI or veto its mistakes.
Oh, absolutely. That's a minefield. Today. It will be, right up until it isn't. There are ways to set up agents and projects right now that make a dramatic difference to how this part of the picture plays out, but those will sink into the harnesses as time goes on.
But also the big problem with maintenance and outsourced teams tends to be the commercial structure around the contract. You get a Build team, who Build the Thing and then: no more features for you, anything you want to add past the original spec costs extra. They hand over to the Run And Maintain team, who get to fix all the bugs that the Build team left, but without the knowledge gained from building the thing. That team is scaled and located to be absolutely as cheap as the supplier can get away with, so they probably don't have the skill, inclination, motivation, or permission to take on any restructuring that would make the bug fixing easier, and they're on the wrong end of the globe, so there's a 24-hour latency on any queries. It's a terrible way to set teams up, but it looks good on paper.
Again, that's peculiar to outsourcing and completely goes away if I have the same team that built the thing own the thing long-term. That's true if it's humans or AI!
> I don't think that's really fixable even with a lot better AI. It's not something that ultimately comes out of the likes of github data.
No, it's a harness problem. You need to start from a maintainable point and keep standards in place. It'll take work to get the harnesses there and it's not ubiquitous. You might also need better models, but I've already personally seen big differences in outcomes between projects that took certain steps and others that didn't; it's nothing revolutionary, mostly stuff that works for humans also works for AIs but you need to know to ask for it.
> I'm not saying that AI isn't going to make things better, btw, I just don't think we'll see a 20x improvement. Probably more like 1.5 or 2x.
I think people radically underestimate the cost of delay. I don't know if 20x is realistic for the AI itself, but I think it's not impossible once the inefficiencies of having to go to other humans are factored in.
It sounds like the economy would largely reduce to the small minority class of independently wealthy people.
It takes a skilled knowledge worker to use these things.
And worse, these are the tasks that help the junior people eventually grow into the skilled knowledge workers required to operate models, so there's a pipeline problem too.
I'm already seeing tech execs/hiring managers getting very frustrated at the lack of new-senior-engineers to hire. The market will correct for this in time.
To follow on from that comment, if the growth in breadth of capacity of AI leads to a decrease in the risk of running a smaller business, which I don't think is an unreasonable prediction, then it's not inevitable people do lose their jobs. Employers get smaller, higher-leverage, and more plentiful.
Not completely, but compared to the middle ages we 50x'd their output. Which is a great illustration of what it means to make a job 50 times more productive. We went from 80-90% of the population being required to barely make enough food for everyone to survive, to 4% of the population producing such an abundance that consuming too much food has become a systemic health issue.
I'm pretty skeptical on the outcomes and the costs also (natural and social as well), but possibly we can have 50x or even more software in the end! The phrase will be truer than ever:
> Software is eating the world!
https://www.agtechmarket.net/news/laserweeding (random web search, I don't vouch for this site, it just looks legit at a glance)
Next innovation could be to scale succession planting, which keeps the ground from being exposed in between crops and lets you transition from nitrogen fixers to users quicker, getting more food out per acre while reducing fertilizer usage. But you can't do that with current harvesters and human labor is too valuable to spend on this.
Also take broccoli harvesting, typically you get a few big heads, then it keeps producing smaller heads, but it's not economical to harvest the smaller heads with human labor. Robotic harvesting lets the same plant produce more food per acre and uses the energy needed for new plants instead to keep producing food.
This is where this is going, the whole industrialism is totally self-serving, and for every problem its answer is digging deeper in the rabbit hole, and creating 2 more problems in addition to solving the initial problem only half-way.
I don't want to say what you are suggesting is not possibly useful, I just want to emphasize how stuff works out in reality, in addition to doing some nice stuff like what you called out (the halfway solution to the problems). All we get is more alienation and humans getting depressed and feeling a lack of purpose... but somehow we cannot afford to pay fair prices for the agricultural work, and pay fair prices for the food, and not overproduce and overpollute... and the same thing is happening in every aspect of the human condition, not only food production, which is the most basic and ancient activity we have been doing.
They do not care unless these companies can get a bailout.
UBI only exists for companies that are too big to fail. Case in point, 2008 and SVB when there was too much money on the line.
One of the AI companies attempted to guarantee themselves a way for the government to bail them out if they were close to defaulting on the debt from the data center build out.
Arguably, the main impact of securing SVB depositors above the $250k limit is that it prevented thousands of people from being laid off that week, as their employers wouldn't have had the money to make payroll the following Wednesday.
Sure, but is that the case now? Is everyone made whole when a bank fails and they have more deposits than the insurance limits? Or only when it's the well-connected / too-big-to-fail?
Looks like the answer is no: https://www.wsj.com/finance/banking/a-small-banks-failure-le...
So I don't think it's unreasonable to describe SVB as a bailout. Not for the investors, but for the depositors. Has anything changed to reduce the moral hazard / make it less likely to recur?
What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics? The roles within a tech company are not the only jobs in the world.
I don't.
> What makes you think the people who used to build (or would have built) software will switch into the industry of "knowing that the thing was the right thing to build", as opposed to something cooler like surgery, city planning or experimental physics?
Because it's probably already part of the job. It's a change of emphasis, not a change of career. Your boss can already ask you to do it. If you're producing code, you're probably also reviewing code, checking it matches the acceptance criteria, testing it, sanity checking that it was the right code to have been written, today.
“There’s more capital than good ideas to fund” has been a complaint from the likes of A16z & other VCs for a long time now. It’s why we ended up with stuff like NFTs getting funded.
I am rather more concerned about competition from CHINA. With how Huawei (2000 -> 2020) crushed every other telecom company and went from nobody to the most revered leader in 20 years, and with the depth of leadership in manufacturing and work culture, if China surpasses USA in AI, all US companies lose.
No one I know feels richer than they did a decade back. I've not been able to meaningfully put up my prices for a decade. People are tired and stressed and scared, particularly scared of a technology everyone keeps telling them will make them redundant.
There is no rising tide lifting all boats, just most of us drowning whilst a few whizz past in their yachts.
I honestly hope these guys faceplant ASAP. Couldn't happen to a nicer bunch of people.
Consumption has risen, inflation adjusted wages have risen for blue collar and white collar alike. Most social mobility has been the middle class moving into the upper middle class, not moving to the lower class.
The main thing holding people back is the housing crisis. This is orthogonal to the value creation of businesses.
Value creation is growth. If it didn’t exist the S&P would still be 42.55$.
This feels wholly at odds with saying most social mobility is upwards. So most of the social movement is into a class where a home and vacations are a given, but we also have a growing class of people who can't afford a home? Per BLS, average real wages are down 0.3% YoY https://www.bls.gov/news.release/realer.nr0.htm .
> Value creation is growth. If it didn’t exist the S&P would still be 42.55$.
This reductively assumes "value creation" is the only effect on the S&P pricing. You'll note a ton of graphs correlate with it, e.g. https://tradingeconomics.com/united-states/inflation-cpi is the US inflation rate, which also tracks the S&P pricing. I.e., if a company was worth $100 a year ago and inflation was 4%, I'd expect to pay $104 for their stock with 0 value creation whatsoever.
Oh the lost irony.
My wages haven't risen for nearly 5 years, while inflation has occurred over the past 5 years. Why the blanket statements?
> The main thing holding people back is the housing crisis. This is orthogonal to the value creation of businesses.
Are you suggesting a "housing crisis," in your words, wouldn't impact consumption? I'm watching my spending (and living like a child in his parent's house, except it's not my parent and I have to pay for it) in the hopes that in about a decade, I'll have saved up enough of a down payment for a home somewhere in my state that I could actually afford the mortgage on the remaining amount. There are plenty of things I'd potentially spend money on but won't as long as I feel like I'm economically stuck and have a chance in hell of saving my way out of it. So this feeling translates to fact.
If you think my personal experience is just an anecdote and doesn't count because it's not being told through the lens of large-scale numbers, fine. But I really agree with the person you replied to that you're gonna have to be a whole lot more specific than "value creation" if you want people to spend money on your AI products "in this economy," whether it's because they're actually strapped for cash or just pretending like you seem to think they are.
It's kind of become socially taboo to not be suffering "in this economy", but on paper it's hard to see weakness in places where there isn't always weakness. As long as the 65-95% are doing well, there isn't going to be a collapse.
From the U Michigan page: https://www.sca.isr.umich.edu/
or from FRED (St. Louis Fed): https://fred.stlouisfed.org/series/UMCSENT
In most of Europe individuals at least don't need any of that. I'm in France and it's just a connection to a government-run website to enter a few figures. It takes less than an hour and most of it is already pre-entered (salary etc); the main thing to add manually is charitable donations.
If you're running a business then yes an accountant can be good (or be required depending on the legal form of the business) but not for individuals.
Otherwise normally costs around $800 to do, because I have a small business too.
Are you? Does it cost you extra (time or money) to be?
True, but I think the GP's point was that what consumers will pay won't be nearly as profitable as what enterprises will pay to increase the output of their developers and knowledge workers. ChatGPT is currently the overwhelming leader in consumer AI usage but only ~5% pay $20/mo.
As a recently retired serial tech founder, I'm now one of those consumers. I use AI webchat daily for general search, Q&A and even to write little automation scripts for myself, yet I haven't paid anyone anything for AI yet. Even after being heavily restricted and performance nerfed to hell in recent months, free webchat AI is still fine for everything I do, and I'm not remotely price sensitive.
Even as AI compute costs fall over time, I doubt serving ads against AI webchat to consumers will generate the kind of high-margin, sustainable growth VCs get excited about. It's so undifferentiated I bounce around between all four leading providers because there's virtually no moat locking casual consumers to any chatbot beyond a single question thread. I guess if it had a nearly infinite context window seamlessly integrated across all sessions, that might be somewhat sticky for some consumers but it could also get creepy for some others - and it would devour gobs of the scarcest resource in AI. Beyond Maslow's Hierarchy of Needs, the mobile phone is the largest revenue, long-term mass consumer product ever but I just got a new flagship phone from a top-tier provider for $30/mo over 3 yrs. IMHO, even an all-you-can-eat, infinite context window, next-gen Mythos couldn't reach and sustain mobile phone levels of global consumer adoption at ~$20/mo. Unlike professional developers and knowledge workers, consumers don't have any "job to be done" big enough for an LLM to command that much of their zero-sum discretionary spend.
That will certainly help but it doesn't move the fundamental limit because resource efficiency is a cost driver not a demand driver - and my argument is against the thesis that lying beyond professional devs and knowledge workers, there's an untapped trillion dollar industry serving LLMs to mass global consumers.
Using Simon's cost estimates, I agree that halving the current $1,000 - $1,200/mo MSRP to profitably serve frontier inference to professional developers and knowledge workers (PD&K) will help Vendor A steal share from Vendor B or C. It will also increase LLM sales penetration into the segments of the global PD&K TAM which can't afford ~$1K/mo for every seat. A fair chunk of the PD&K workers in many SMEs aren't included in today's ~$1K/mo per seat license pool, especially in 2nd and 3rd world geos. When the price falls to $500 and $250 most will but that's still just saturating the existing PD&K TAM - not pushing into mass consumers.
While the PD&K TAM is big, justifying Trillion+ dollar capex spend requires believing the TAM is much more than PD&K and eventually grows into converting a couple billion non-PD&K consumers into ~$20/mo subscribers. I don't buy it for two reasons:
1) The Comps: There are vanishingly few examples of long-term, mass consumer adoption of a discretionary technology at that scale. Mobile phones at ~$15 to $30/mo are the obvious one but LLMs are nowhere near being that valuable to the average plumber in Des Moines, baker in Jakarta or retired nurse in Hamburg. Pondering it, I just imagined forcing any of those people to choose between their mobile phone and an LLM chatbot. Sure, some who are flush with cash might choose both but for most consumers in the world ~$20/mo is big enough they'd have to pick one and ~zero percent would choose the LLM over their phone. After mobile phones, the second comp for discretionary tech spend I thought about was XBox and Playstation monthly gaming subscriptions but combined they have less than 90M paying subscribers and the ARR is just under $10/mo. As an industry, "Big LLM" is spending well over a trillion dollars every five years. XBox and PS ARR doesn't even cover paying the interest on that capital, much less the 3 to 5x returns hedge fund investors are betting on.
2) The Alternative: It's useful to doubt my own intuitions and one counter to my skepticism is to assume "But LLMs aren't finished yet, they're going to get much better." How much better could an LLM which can be profitable at ~$20/mo get than Claude Mythos in the next five years? Instead of debating future unknowables with myself, I've found it's better to just imagine the most perfect future product I can that's still realistically plausible. So, let's imagine we're willing to spend a million dollars a month to very unprofitably deploy a prototype to test the consumer demand for "Tomorrow's Awesomest $20/mo LLM" today. So we gather a few hundred super smart, broadly knowledgeable intellectuals together at one top-tier university research library, where they'll have access to every commercial database and unlimited Claude Mythos 2.0 and ChatGPT 6.0. Since our experimental budget is $1M/mo we can afford to add in several Nobel prize and Fields Medal winners too. They'll work together manually reviewing and improving not only every LLM answer but also our test user's prompts - and of course our test chatbot will have human-level real-time speech recognition and vision (via Zoom and screen-sharing with actual genius-level humans), making this truly a test of the "smartest, most accurate, best consumer chatbot" we can imagine.
Now, let's run the test by having one thousand mass consumers try it out and see how many Des Moines plumbers, Jakarta bakers and Hamburg retired nurses we can convert to a 1 year @ $20/mo subscription for our $1M/mo ultimate chatbot simulation. Playing this thought experiment out in a bunch of ways, I find some percentage of outliers, iconoclasts and closet intellectuals would go for it but... the vast majority just don't find it enough better than the "free" chatbot alternatives AT&T includes with their phone subscription or Samsung bundles with Galaxy phones - despite those only being ChatGPT 5.4-level. It turns out, most plumbers, bakers and ex-nurses don't have a compelling "job to be done" in their daily lives that even an MoE panel of actual Nobel and Fields medalists with Ivy League professors can make enough more valuable than an inferior but free-to-me chatbot, in the judgement of our Des Moines plumber. While the world's smartest chatbot is nice, when it comes time to pay, he prefers having one additional premium football match on TV and a six pack of cold beers every month.
A recent post here said AI spend could be "20% of every software developer's salary"... and that seemed plausible based on productivity improvements. That's not about a phone bill.
What sort of new value, and why will people pay for it from someone else rather than prompting for it themselves?
The AI might very well be used by a noticeable % of the population daily, but that doesn't mean they will be paying a trillion dollars to the leading US AI companies.
Just realized something: if one worries about losing jobs to AI, tokens' high unit cost is good news. To say the least, high cost would delay the displacement, if any, right?
In the meantime, someone shared the below on X. I guess the moral of the story is that "good enough" does not just displace software engineers, but also models.
> I Went From $3,000/Month on Claude to $5/Week on DeepSeek
> And honestly? 80% of my work is identical.
> For the past two months, I was burning $3-5K monthly on Claude Code. Every idea from design to development to testing - full end-to-end automation, even simulating users to test my products and provide feedback.
> Extremely token-intensive. But Claude's caching sucked, making it insanely expensive.
> Then I discovered DeepSeek V4.
But the point is that if people are willing to delegate part of their salary (e.g., buy consumer products), vs requiring employers to pay for the tokens, then it's quite possibly a net win. Something like "I pay a largeish fee every month to make my own job much easier", similarly to how we buy a car to make commuting easier.
They are assuming ~10% global GDP growth instead of ~3%. You probably don't need the same %s if the pie grows a ton.
I'm highly skeptical we get that growth, but if you aren't, it makes it easier to digest.
The more AI causes productivity increases, the fewer workers will be needed. This will heat up the job market even more and bring salaries down.
Net effect of this productivity increase: less consumption by the masses, even though you may be producing more goods and much more efficiently.
A third effect also comes into play that once all this starts to happen, common people, who are generally living paycheck to paycheck, will now start to hesitate towards making any long term investment, housing included. And that indirectly will end up impacting financial and banking sector, which will then impact existing savings, bonds yields and retirement funds, and the recession-like cycle starts.
This productivity increase only makes sense if it is capped to a very small number, like 20% max. Beyond that, who will these companies even be selling to?
Am I overthinking all this?
That only holds if companies have a fixed need for "productivity" which is met by their current employees, such that their employees becoming more productive means they need fewer of them.
Every company I've ever worked for has wanted to achieve way more than they are able to get done with current resources.
But generally yes, the biggest open question about all of this is how the impact will play out on the economy, job opportunities etc. I've not seen anyone come close to a confident prediction about how this will play out.
I mean sure. Every company wants an infinite addressable market. But that doesn't mean it exists.
It might not be possible to sell 10x the software we sell today. It might not even be possible to sell 2x.
> Net effect of this productivity increase: less consumption by the masses, even though you may be producing more goods and much more efficiently.
Big tech companies can't even create login flows and account recovery flows that work for everyone yet. There are countless stories of folks losing access to business Instagram accounts that get hacked, Google support from a human to fix a problem that is outside of their help articles is non-existent, etc etc. There's still so much "low-hanging fruit" IMO that isn't particularly fun or exciting to fix, but ask your average non-tech friend or family member what they think of the Facebook + Instagram security settings pages / sites / desktop-only settings.
Who is going to pay for all of these subscriptions that will power this GDP increase when average purchasing power of those outside of the top ~10% of earners is decreasing YoY? We're headed toward food and water shortages next to sprawling datacenters, not shared societal prosperity and a healthy middle class.
Nope, if AI were to realise the hype, you have to take into account macroeconomics. Usually this isn't a problem for most businesses.
> The more AI causes productivity increases, the fewer workers will be needed. This will heat up the job market even more and bring salaries down.
People also underestimate that the reason why companies are so excited about AI isn't to increase productivity, it's to fire workers and crack down on worker rights. They won't lay people off because AI means they don't need as many people to get the job done, they'll fire everyone while doing a much shittier job, because they hate having to abide by workers' rights and pay people.
Secondarily, reducing the cost of making a thing doesn't always mean you get less of a thing. For me, certainly, what happened is that I write way more software than I originally did. When we built compilers, the amount of human engineering effort required to do things plunged, but the number of software engineering jobs didn't go down.
This is as bad as models will ever be. That part is true. And it's entirely possible we go foom. But it's also possible we don't, and then it depends on where the asymptote lands.
0: https://www.slowboring.com/p/this-economic-myth-needs-to-go-...
Why does this have to be the case with AI but it didn't have to be (and wasn't) the case with the steam engine, electricity, the automobile, or the computer & internet?
Certainly, AI could be different.
It's curious to me why the vast majority of people on here think it must be different.
Some people take the view that AI could make knowledge work largely irrelevant. Any niche humans could carve for themselves would only live long enough to generate training data for the AI to automate.
source: https://isaiprofitable.com/
If John Ternus wants to spend some money, spend it on bringing memory in house. Apple has the money and the engineering talent to do so; have it fabbed onshore in partnership with TSMC.
Do it, Apple, because you have to, not because you want to: the Chinese will probably be taking over the memory industry worldwide by taking advantage of the greed of the three memory companies and their AI overlords.
2. The companies themselves buying tokens for operations to make the work more efficient, e.g. Salesforce agent or Microsoft Office agent or random SaaS inventory agent. (And if you say those will go away (which I don't believe), it's even more bullish. The tokens just go to someone vibe coding XYZ, which is EVEN MORE than if you were to buy SaaS, because it's SaaS product x companies that built it instead of just one.)
3. The companies SELLING tokens. This is also new markets like schools and small business (e.g. the local gas station buying an inventory tool)
4. The consumers "buying" (I put it in quotes because it can be subsidised by the company) through ChatGPT, Strava, Instagram/Netflix recommendations, etc.
Local models still take compute, and while it may be cheaper, it is the same argument as on-prem vs cloud. No one operates on-prem unless they HAVE to for regulatory reasons. Margins will come down and you just spin up a GCP/OpenAI/Anthropic agent.
It may be "cheaper" but rationally it's better to pay someone to manage it. That's why Hetzner only had $367M in revenue (a lot, but tiny compared to managed services).
Privacy is also a huge issue.
The scale of these investments puts the lenders at substantial risk, so the lenders will do anything to make it work. If the current lenders will be damaged by extended payback periods, they can simply sell the debt to someone else who won't be.
https://github.com/danielmiessler/Substrate/blob/main/Data/K...
Knowledge worker compensation is 35 - 50 trillion a year globally (6 - 12T in the US alone.) That's a huge TAM. It's still close but 5T over 5 years seems doable.
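A quick sanity check on that ratio, as a rough sketch. The only inputs are the figures quoted above ($35-50T of global compensation a year, $5T of spend over 5 years); the variable names are mine.

```python
# Inputs are the figures from the comment above; nothing here is measured data.
comp_low_t, comp_high_t = 35.0, 50.0   # global knowledge-worker compensation, $T/year
spend_total_t, years = 5.0, 5          # $5T of token spend spread over 5 years

spend_per_year_t = spend_total_t / years

# Share of compensation that would have to flow into tokens each year.
share_low = spend_per_year_t / comp_high_t    # if compensation is at the high end
share_high = spend_per_year_t / comp_low_t    # if compensation is at the low end

print(f"${spend_per_year_t:.1f}T/year is {share_low:.1%} to {share_high:.1%} of compensation")
```

So the ask is roughly 2-3% of global knowledge-worker compensation per year, if those compensation estimates hold.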
>... unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.
The way we make ICs 10x productive is not just making each of them individually more productive, but by removing the coordination overhead of large organizations, because overhead scales super-linearly with the size of the org. And orgs will shrink automatically as AI-assisted ICs take ownership of larger and larger scopes of work, leaving much more budget for tokens.
I went into this in a bit more detail along with some made-up numbers here: https://news.ycombinator.com/item?id=48040999
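To illustrate the super-linear overhead claim, here is a toy sketch. It assumes coordination cost is proportional to the number of pairwise communication channels in a team (the classic n(n-1)/2 count). That model is my assumption for illustration, not something the comment specifies, and real orgs use hierarchy to cut it down.

```python
def channels(n: int) -> int:
    """Distinct pairs among n people (n choose 2), a proxy for coordination paths."""
    return n * (n - 1) // 2


# Channels per person grows linearly with team size, so total overhead
# grows roughly with the square of headcount under this toy model.
for team in (5, 10, 50, 100):
    total = channels(team)
    print(f"{team:>3} people: {total:>5} channels, {total / team:.1f} per person")
```

Under this model, shrinking a 100-person org to 10 people cuts pairwise channels from 4950 to 45, which is the kind of effect the comment is pointing at.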
2. Where does this $5T number come from? If they make $4T in revenue over the next 5 years instead, what happens?
What I'm often hearing though is the equivalent of "gg ez" when I bring that up. I don't understand how this will at any point blitz scale to profitability. As far as I know they don't have positive cash flow, no one has a moat and I don't think they will push out engineers.
But, at that point I think the big players’ moats will have dried up. Local models will probably be sufficient for 99% of daily office worker tasks.
So I disagree with TFA’s premise. I think this fear is probably shared amongst the LLM giants, and they’re still hoping that neural network transformers are somehow the path to AGI (probably not, imo).
Your scope is too narrow. The companies target more than white-collar jobs. And $1t is around 0.5% of the world economy.
What’s their moat? Is it hoping for regulatory capture where scraping is made illegal the day after they finally finish scraping all human language?
It’s like OpenAI dammed the Colorado, and Anthropic dammed the Hudson, and now they’re both trying to sell us bottled water subscriptions at $100 a month. I don’t know how well the dam part of the analogy holds up, but the water part feels strong. Compiling models based on humanity’s written output feels like something no corporation should own.
Anthropic Max: $100/month
OpenAI Pro: $100/month
Total paid: $200/month
API equivalent usage: $2,180.16 in 30 days
So I paid only 9.17% of the API-priced value, a 90.83% discount, or about $10.90 of API-priced usage for every $1 paid...
That proves heavy usage but not sustainable unit economics.
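For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in this comment; the variable names are made up for illustration.

```python
# Figures as quoted in the comment; variable names are illustrative only.
subscriptions_usd = {"Anthropic Max": 100.00, "OpenAI Pro": 100.00}  # per month
api_equivalent_usd = 2180.16  # API-priced usage over the same 30 days

paid_usd = sum(subscriptions_usd.values())
share_paid = paid_usd / api_equivalent_usd        # fraction of API value actually paid
discount = 1 - share_paid                         # effective discount vs. API pricing
usage_per_dollar = api_equivalent_usd / paid_usd  # API-priced usage per $1 paid

print(f"Total paid:         ${paid_usd:,.2f}")
print(f"Share of API value: {share_paid:.2%}")
print(f"Effective discount: {discount:.2%}")
print(f"Usage per $1 paid:  ${usage_per_dollar:.2f}")
```

This only confirms the ratios; it says nothing about what the usage actually costs the provider to serve.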
Anthropic's reported numbers point the same way:
Q2 revenue: $10.9B
Adjusted operating profit: $559M
Margin: 5.1%
SpaceX compute: $1.25B/month = $3.75B/quarter
So one compute supplier alone equals 34.4% of quarterly revenue and 6.7x quarterly adjusted operating profit.
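The ratios above can be reproduced with a short Python sketch. It takes the quoted revenue, profit, and compute figures at face value (they are the commenter's numbers, not independently verified here), and the variable names are hypothetical.

```python
# All inputs are the figures quoted in the comment, taken at face value.
quarterly_revenue = 10.9e9          # USD
adjusted_operating_profit = 559e6   # USD
compute_per_month = 1.25e9          # USD, single compute supplier

margin = adjusted_operating_profit / quarterly_revenue
compute_per_quarter = compute_per_month * 3
compute_share_of_revenue = compute_per_quarter / quarterly_revenue
compute_vs_profit = compute_per_quarter / adjusted_operating_profit

print(f"Margin:                   {margin:.1%}")
print(f"Compute per quarter:      ${compute_per_quarter / 1e9:.2f}B")
print(f"Compute share of revenue: {compute_share_of_revenue:.1%}")
print(f"Compute vs. profit:       {compute_vs_profit:.1f}x")
```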
It's difficult for the blogger to understand something when their incentives depend on not understanding it...
My usage is therefore a useful indicator of quite how much those enterprise companies may be spending on tokens, given the new pricing scheme.
If enterprise companies were still getting the same discounts that I get myself I would not have written this article.
(I had to dig into your margin figure. It looks like you calculated 5.1% as 559000000 / 10900000000 * 100, but that $559M "adjusted operating profit" figure includes training costs. Usually when we talk about margin on inference we don't include those, since those costs are fixed; margin calculations make more sense against the variable costs of serving a token.)
Let's put it in context. Google's annual revenue seems to be north of $400B. So if OpenAI suddenly had Google's revenue, it would still be insufficient to recover their investment.
And it's a ticking time bomb, because $1T in servers, CPUs, GPUs and memory is going to be worth $200B in 5 years. You can say they can keep using what they've got. Sure. But they're also not going to stop spending on new hardware. And the competitor that comes along in 5 years and spends $1T doing the exact same thing is going to have a huge advantage.
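To put a rough annual number on that write-down, here is a small Python sketch. The $1T purchase price, $200B residual value, and 5-year horizon come from the comment; the two depreciation schedules (straight-line and constant-rate declining balance) are illustrative assumptions, not something the commenter specified.

```python
# Inputs from the comment, in billions of USD.
purchase_value = 1_000
residual_value = 200
years = 5

# Assumption 1: straight-line depreciation (equal loss every year).
straight_line_per_year = (purchase_value - residual_value) / years

# Assumption 2: constant-rate declining balance (same percentage lost every year).
annual_retention = (residual_value / purchase_value) ** (1 / years)
declining_rate = 1 - annual_retention

print(f"Straight-line loss: ${straight_line_per_year:.0f}B per year")
print(f"Declining-balance rate: {declining_rate:.1%} per year")
```

Under either schedule the hardware loses value on the order of $160B a year on average, which is the scale to compare against the revenue figures in this thread.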
OpenAI at this point reminds me very much of the Russ Hanneman pre-money hype cycle.
Data centers come down to performance-per-Watt. Electricity accounts for 20-30% of a data center's operating cost [1]. I don't know the exact breakdown but the GPU part of that is probably the majority given how power hungry GPUs are. The B200 is upwards of 1200 Watts [2]. The B200 is rated at ~4.5PFLOPS of dense FP8. So you're getting 3.75TFLOPS/W. We don't know what the next generation will look like. The H200 (Hopper architecture card that preceded the B200) had ~4PFLOPS apparently but also lower power consumption. Obviously this changes depending on whether you're looking at dense or sparse and FP8 vs INT8 vs INT4 vs FP4, etc., so we're just using FP8 as a yardstick.
Imagine a fictional B200 successor, the T200, that has 8PFLOPS of dense FP8 at 1000 Watts. If the T200 costs about what the B200 does now, a DC built on it gets roughly double the PPW, so the same size DC with the same electricity load does the work of about 2 of your old DCs for the same operating cost. That's a big deal when you've laid out a trillion dollars.
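A quick Python sketch of the performance-per-watt comparison, using the B200 figures quoted above and the fictional T200. The `Accelerator` class is a hypothetical helper for illustration. Note the unit: 4.5 PFLOPS over 1200 W works out to 3.75 TFLOPS per watt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    """Hypothetical record of a card's dense FP8 throughput and power draw."""
    name: str
    dense_fp8_pflops: float
    watts: float

    @property
    def tflops_per_watt(self) -> float:
        # 1 PFLOPS = 1000 TFLOPS
        return self.dense_fp8_pflops * 1000 / self.watts


b200 = Accelerator("B200", dense_fp8_pflops=4.5, watts=1200)              # figures quoted above
t200 = Accelerator("T200 (fictional)", dense_fp8_pflops=8.0, watts=1000)  # made-up successor

ratio = t200.tflops_per_watt / b200.tflops_per_watt

for card in (b200, t200):
    print(f"{card.name}: {card.tflops_per_watt:.2f} TFLOPS/W")
print(f"PPW improvement: {ratio:.2f}x")
```

With these inputs the fictional card comes out at a little over 2x the B200 per watt, which is the basis for the "2 of your old DCs" comparison.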
[1]: https://iaeimagazine.org/electrical-fundamentals/how-much-el...
[2]: https://www.trgdatacenters.com/resource/h200-power-consumpti...
You have either never seen a tech cycle, or need to be reminded of that. The pressure to buy more expensive plans is already starting to form.
It seems quite possible to me that developer tooling is going to end up being the biggest win from LLMs because there is a product-market fit -- and also quite possible that OpenAI and/or Anthropic end up getting bought for pennies on the dollar because their burn rate is unsustainable. AI may end up being this generation's "dark fiber."
And in that sense, if Anthropic and OpenAI are able to create the projection that they can be profitable despite finances seeming bubbly at best, I think what happens is that these companies spew so much content that people like Simon get into it too.
There is a deeper problem of people falling into AI psychosis too, in general. I am not sure if Simon has fallen into it or not.
I think the greatest point that can be made here is to not offload your thinking to others and to think about the situation yourself. Sounds familiar (looks like we are all offloading our thinking itself to machines).
Side-note: As humans, we have a tendency to quickly judge or make quick decisions which stems from our times foraging and scavenging in jungles.
Another side-note: at a certain point, I am unsure how much to think about AI. Certainly the discussions about it that were happening 2 years ago aren't helpful in the contexts where AI is used now (someone who was getting into the weeds of AI 2 years ago isn't in any way better off than someone just getting into it, say, 2-3 months ago).
The industry is moving fast, but that doesn't mean you can't catch up with it. I feel like the word "fast" has made people think they are falling behind, which is wrong imo. I don't see any reason for FOMO if you aren't using AI already, and I find that notion naive.
People might be forming strong opinions (AI psychosis) and skills around the tools available at the moment, just as they did 2 years ago. We don't really know the tech, as these are still black boxes, nor how they will progress or which of these "AI skills" will survive in the future. Heck, we aren't even sure whether these tools will survive, or whether they'll be made orders of magnitude more expensive simply to break even, since they are currently given to us at a fraction of the price.
I don't know if I should form (strong) opinions yet, and there's also the question of whether it's worth so much thinking effort in the first place. I'm probably just gonna do my own thing (the way I want to), which includes learning C at the moment, because learning is fun.
The question I wish to ask: what would happen to these AI companies if they turn out to be anything but wildly successful, both for the investors who have already invested in them and for those who might be investing indirectly in the near future (passive investors, retirement funds)?
I would love to hear your thoughts on it!
Thanks and have a nice day :-D
I'm not nearly enough of an economist / finance person to answer that credibly, but I expect they'll go bust, and a lot of people will lose their shirts.
... and the model weights will be sold to other companies who will then run them at a profit, and eventually figure out an economically sustainable way to train new ones.
The 1800s railway booms are a good comparison here - a lot of companies went bust, a lot of investors lost money, and we still ended up with railways.
If the AI companies all go bust we're going to have a lot of spare data center capacity!
I can be wrong (I usually am), but an AI DC != a compute DC, and I doubt spare AI capacity would decrease the price of servers substantially (well, not exactly; I hope you read my whole message so that I can better explain what I mean). AI DCs try to optimize for one thing: running GPUs with immense scalability and flexibility (0 to numbers>=large_number).
Currently, it's actually way worse: server providers are some of the worst impacted by the industry at the moment, because each server requires RAM, and RAM is, well... increasing in price exponentially. It's really a tough time to be a provider (in certain respects), directly because of AI.
It is unclear to me whether spare DC capacity will have any meaningful impact on that. I don't think that, at least within compute (as opposed to GPU/AI DCs), space was too large of a problem.
Fun fact: one of the largest providers (BuyVM) had the price of the datacenter where they colo increase by many tens of thousands of dollars because of the immense demand for datacenter spots at the moment, so much so that they did their first price hike in decades! The situation is this dire :-(
RAM prices might come falling down and DCs might get cheaper, but they can only get cheaper up to a limit; they still need to pay, for example, DC security employees.
And I wish to suggest that, if anything, investors might want to recoup their losses from an AI bust with what little they have left (ahem, the DCs).
For an even more egregious example of what I am suggesting, there are many New York LLCs who would much rather leave the properties they own empty than decrease the price (which they have set to some egregious amount). I think that for them the math somehow ends up working out in the end, so there might be something more to it.
I wish I was an optimist, but I don't believe the gains from spare data center capacity are worth even a fraction of a fraction of the damage if AI were to go bust as you suggested, with trillions of dollars vanished.
So, with the data I have at the moment, I am unable to suggest that compute would be cheaper. Heck, it was cheaper before AI, and compute prices have never been something people worry about, because there are options sometimes 10x cheaper than AWS, GCP, and Azure, such as Hetzner/OVH and others (yes, it's not a 1:1 situation, but it's still a 95% overlap and, for all intents and purposes, great).
I can see the potential for GPU compute to get cheaper (oh boy, it's so much more expensive than CPU compute), but I feel like GPUs, aside from AI, might still have a much more limited niche than generic CPUs.
The issue wasn't ever the pricing. Simon, I own $7/yr VPSes which run my websites fine because they are written in golang. I doubt it can get cheaper than that. (You can get a $3/yr VPS if that is what you are interested in, using a NAT VPS + cf tunnels.)
I would once again appreciate hearing your thoughts on it. The only thing I realistically see is RAM producers ramping up production and creating a RAM glut in the next few years, but imo the prices would even out over the long term.
I have seen the point about spare DC capacity raised multiple times, so I finally ended up writing a message which hopefully captures the nuance, but once again, I don't know the future.
Waiting for your reply and have a nice day Simon (& other readers) and thanks for reading if you did, I appreciate it :-D
This is where the napkin math is breaking down in a big way. There is absolutely no reason to assume this will only impact "knowledge workers". Farmers use computers. Farmers will use AI.
(I'm not trying to imply that LLMs can replace software engineers, it's just an interesting comparison. If nothing else, I suspect that if the cost of development goes down, demand for custom software will go up.)
If it works. And I’m not sure who is going to buy the stuff the machines produce, but shrug. Presumably some bots click ads for NFT’s that other bots generate.
So besides the insane hardware buildouts you're correctly mentioning, I don't understand how anyone that invests in these companies is supposed to make their money back in any sort of reasonable timeframe?
The cynical part of me is looking at what happened to the NASDAQ rules recently where essentially index funds are going to be forced to buy SpaceX shares much earlier than they previously would have (i.e., before the price has a chance to reach its real valuation). Which, um, I'm guessing these stocks are going to drop pretty hard when people start looking at the financials of these companies.
My suspicion is that the point of these IPOs is essentially to dump the bill on the unwilling public by forcing various institutions to buy it (ie, your 401k or pension is buying this shit), and maybe their investors can squeeze some money out of this before the stocks reach an equilibrium that's probably like 1/10th of what they're "valued" at.
I am willing to bet a Twix we'll look back on that stuff in 2 years with a lot of embarrassment
What I do not understand is: large sectors of the economy all simultaneously taking this punt, with the necessary productivity boost, as you say, far more like: 2x, 5x, 10x
> We're talking about a world where you need 5% of every knowledge workers salary to go into tokens. 20% if you're a developer.
With that much money, the companies can easily buy their own hardware and host free public models; no need for those expensive subscriptions.
It's much like when developers would waste tons of money on AWS spinning up massive test VMs and leaving them running without care. Until the finance people cracked down on it.
Chapter 11 is not Chapter 7. Businesses survive chapter 11 bankruptcies all the time. For example, WeWork.
Uber was basically only ever software to help people use their own cars so a very small part of their valuation was physical stuff to upkeep, it was just deals and obligations they had.
Not sure how it shakes out for Anthropic and OpenAI. There’s a lot of physical capacity that needs to be built out and can depreciate. But there’s also a lot of network effects and dependencies being built in with enterprise users.
I don’t know how swappable the tooling is either. I think over the long term the UI, model training and documentation, and infrastructure are going to end up being run by different parties and I’m not sure which leg of that chain ends up in a position to skim most of the profit off. My guess is that Apple and Google end up raking in all the money since they control the OS and app stores while the rest of the stack gets driven down to being generic commodities. At least where mass market consumer adoption is concerned.
> But then you sometimes go and talk to your senior engineering leaders and you’re saying, OK, how many projects that were on the cutting room floor got moved above the line because of the productivity gains because 25% of our code commits were via Claude Code last quarter?
> That link is not there yet, right? I think maybe implicitly there’s more that is getting shipped. But it’s very hard to draw a line between one of those stats and, OK, now we’re actually producing like 25% more useful consumer features, right? And that line is hard to draw.
That's pretty weak sauce. I don't think that justifies the headlines that came out of it, personally.
He also said in that article that what prompted the discussion was the public statement by the Uber CTO that he had already burnt through his organisation's yearly AI budget in April. Please stop this shilling, mate, and stop trying to hide the overall perspective behind this or that word.
> The most discussed has been Uber, based on this report where CTO Praveen Neppalli Naga indicated that Uber had “maxed out its full year AI budget just a few months into 2026”, mostly thanks to Claude Code.
> Given that Claude Code only got really good in November it’s entirely unsurprising to me that a budget set in 2025 may have failed to predict demand for that tool in 2026!
What does this even mean? Is this about speed of development? Is this about headcount? LoC? How are coding agents contributing to productivity in places like GitHub, Shopify or Meta? I mean companies that already have an established product. I really wanna understand this because I'm not seeing that GitHub's product suddenly became so much better than it was 2 years ago, so where's all that productivity going?
We've also increased how much our coworkers need to read, or deal with. You can get an AI to make any point you want, so you can ignore the 5 humans raising alarms due to the 1 clanker you made say what you want to hear.
All numbers going up.
There are obviously people producing additional true value with it, probably, but that's almost certainly scarce.
Except that if your company goes 20% faster than the other companies, you win market share. But then everyone will use the same tools and companies will be back at even speed, yet the tool will stay.
Now...if the market is saturated, it's useless to try to do things faster. Cheaper yes, but not faster.
> That's a _huge_ shift. Most people I know cite +20%-40% velocity with these tools, against the actual work their company cares about doing. +20% speed for +20% spend isn't going to motivate a trillion dollars a year in spending.
And most research shows people far overestimating their own gains. Once companies start counting the actual (and not just reported) gains, AI budgets will be more limited as people realize it's a useful and versatile addition but not a replacement for most types of work.
> We're not there yet. This is still the upswing of the hype cycle, and unless we figure out how to make developers 2x, 5x, 10x as productive on stuff that matters, this isn't going to play out well.
Upswing of the hype cycle while growth of the tech itself is flattening, both because of the tech's innate issues (which might or might not be solved, though some papers claim they are unsolvable with the current approach) and just the fact that the spike in growth came at such a high economic cost that it put the brakes on itself.
It really does have a particular lane for each chore, and it’s reproducible.
I have a few live websites built using LLMs and they will just go for default generic templates and colours if there's no vision.
I'm increasingly realizing this math is wrong, because LLM use is really sticky.
If Anthropic 100x'd prices tomorrow for their best model, so some companies offered 50% salary to keep 100% of your AI usage:
a) There are programmers who would take this deal. They've gotten to the point of doing what feels like even less than 50% of the work, developers were already pretty well paid, so they'll take it.
b) There are companies that'd offer this deal. Even if the only people who are taking this deal are not the best engineers, and the AI output is not the greatest, I think the last 6 or so years have seen a lot of companies realize capitalism is not as competitive as it seems.
They're not worried about putting out a worse product because... frankly, what else are you going to do? CF lay a bunch of people off, support gets awful: well you're probably not building a new Cloudflare in the next few years.
In the meantime the AI will get incrementally better, their market share will grow, and you won't be able to compete without taking the same faustian bargain.
-
Maybe I was just naive but it's making me realize how much we take for granted in the world. Both the quality and relative value of things don't have to go up over time. Quality can go down while prices go up, and nothing will really stop it. Competition should stop it, but competition is really slow and can be interfered with. And as prices go up competition gets really hard.
Or did we just get scammed?
And that's not considering that capitalism is going to do what it does best: if they really found a way to be profitable, competitors are going to fight them on pricing. The margins of Anthropic, OpenAI, Google, etc. are their competitors' opportunity.
It's not as if there weren't Chinese models that are nearly SOTA. I don't know where the French (Mistral) are, but they may try to get in the game if there's a way to be profitable (not that France, or the EU for that matter, is relevant in anything tech or has any tech company besides ASML and SAP in the Top 100, but who knows).
Wait, what? They spent two orders of magnitude less on hardware.
> Gartner forecasts that large AI companies would need to earn cumulatively close to $7 trillion in AI-driven revenue through 2029, which is close to $2 trillion per year by the end of the period. In order to achieve “historic returns,” the providers would need to earn nearly $8.2 trillion in the same period.
Everyone's agency is 100% captured by belief in Wall Street. Too few <50 have any meaningful labor skills to blink.
We'll continue to have consent manufactured via media platforms and in 3 years no one will bat an eye at these companies being worth $12 trillion as Altman and Musk climb two ladders holding a "mission accomplished" banner.
I'm not even sure that 1 in 8 people I know would qualify as a knowledge worker, let alone a knowledge worker that might profoundly benefit from on-the-horizon AI. And I'm in a highly skewed population.
Most people I've met -- and again, in a pretty darn skewed sample globally -- see $65/mo as a lot of money to spend on technology of any kind and can't think of anything much they need from "a personal knowledge worker in their pocket". I don't know a single person in real life who remains excited about AI at all, and only a few software engineers who feel it'd be worth that much.
Everybody seems to be mostly confident with the "knowledge productivity" in their personal and professional life and pretty skittish about spending in today's economy. Most would be excited about a magic new robot that affordably saved them from unwanted physical labor and drudgery, but nobody needs much real help making appointments or filling out forms or whatever.
That's not to say I won't be proved wrong some day, with some further innovations in AI products, but global-scale demand isn't waiting for anything that's been released so far.
27% of the world's workforce is in agriculture (contrast to the US where it is 1-2%). 15% in manufacturing.
A lot of people work in "services" (especially in high income nations, where it's roughly three quarters) and some of those are knowledge workers... but a huge number of them are nail technicians or hairdressers or bartenders (etc etc).
I see a lot of out of touch takes here but this might take the cake
How do you know this? I'm certainly open to recalibrating my numbers, which is why I asked for the source.
https://www.gartner.com/en/newsroom/press-releases/09-24-201...
> "...with more than four-fifths of that growth coming from the emerging world."
If anyone thinks this is a part of the global TAM that's got $1000 a month to blow, well then I've got a stable of flying unicorns to sell you.
[1]: Berg, Janine and Gmyrek, Pawel, Automation Hits the Knowledge Worker: ChatGPT and the Future of Work (April 21, 2023). UN Multi-Stakeholder Forum on Science, Technology and Innovation for the SDGs (STI Forum) 2023, Available at SSRN: https://ssrn.com/abstract=4458221
To simplify break that 1B up into 3 levels of purchasing:
1) High-tier (US, Western EU, ANZ, Japan, South Korea, Singapore, UAE, etc) - 200-250M knowledge workers.
2) Mid-tier (Eastern EU, Latin America, urban China, India tech sector, etc) - 300-400M
3) Low-tier (Rest of the world) - 300-400M
Low-tier users are mostly free tier or heavily subsidized pricing.
Mid-tier are going to account for USD sub-$100 tiers. Probably averaging less than $50/seat.
High-tier are who you are assuming is the 1B. Users are not equal in that knowledge worker count, so there aren't 1B knowledge workers to charge money.
And when you consider Low-tier users a majority of those are free users which need to be subsidized by the High-tier users. So either free tiers get much more restrictive or the providers lose additional training data. A bulk of Low-tier users cost money and provide little to no revenue.
Edit: And think about Mid-tier and Low-tier for 5 seconds. Why would they pay Anthropic or OAI when they can get 100x+ inference from DeepSeek or Xiaomi? Mid-tier may be the only area that is willing to spend money on a US provider, but I would wager significantly on the fact that users in the Low-tier almost universally do not care.
I do think free accounts are going to end pretty soon, and some of the workers in your tier 3 will pay, but even without them this seems like a pretty healthy market size. I also wouldn't be surprised if mid-tier workers are able to afford the $1000/yr vs $500. I use yearly rates because I find it easier to compare them to GDP/salary numbers.
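A back-of-the-envelope Python sketch of what those tiers imply. The seat ranges are from the tier breakdown above. The per-seat prices are assumptions for illustration only: $1000/yr for high-tier and $500/yr for mid-tier (the two yearly figures mentioned in this reply), and $0 for low-tier, treated as free.

```python
# Seat ranges in millions, from the tier breakdown in the parent comment.
seats_millions = {
    "high": (200, 250),
    "mid": (300, 400),
    "low": (300, 400),
}

# ASSUMPTION: USD per seat per year. These price points are illustrative,
# not figures any provider has published.
price_per_seat_year = {"high": 1000, "mid": 500, "low": 0}


def revenue_range_billions(tier: str) -> tuple[int, int]:
    """Return (low, high) yearly revenue in billions of USD for one tier."""
    low_seats, high_seats = seats_millions[tier]
    price = price_per_seat_year[tier]
    # millions of seats * USD per seat = millions of USD; // 1000 gives billions
    return low_seats * price // 1000, high_seats * price // 1000


per_tier = {tier: revenue_range_billions(tier) for tier in seats_millions}
total = (
    sum(low for low, _ in per_tier.values()),
    sum(high for _, high in per_tier.values()),
)

for tier, (low, high) in per_tier.items():
    print(f"{tier:>4}: ${low}B to ${high}B per year")
print(f"total: ${total[0]}B to ${total[1]}B per year")
```

Changing the assumed prices changes the outcome directly, so this is a way to test scenarios rather than a forecast.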
I believe we've started to see the top of what individuals and businesses are willing to pay for the current model capabilities. We are nowhere near AGI and models are really only providing significant value in niche markets currently (programming and cybersecurity). And just like SaaS the enterprise has the option to buy hardware and leverage their own models at will which can potentially offset costs and TAM as well. I have talked to a number of large financial corporations in the last 6 months and most have internal initiatives. The same applies in the healthcare vertical.
$250B per annum with AI? That's 20% of global software spend now. Sure, that's possible, but that assumes current market prices hold. What if inference ends up normalizing between DeepSeek/Xiaomi & Anthropic/OAI? There's 50% of your revenue, and with current costs for inference and training in the US at astronomical levels, the US AI industry could also very well be set up to implode overnight.
Lastly I don't believe free can go away anytime soon because it can't. As soon as Anthropic and OAI remove that option those users will move to whatever is. For most of those users it's not a luxury to choose, it is the only option.
The financial engineering occurring right now is something I don't doubt will be a textbook lesson in the future. We've seen it before and I believe Peter Sorkin when he says that we will see a crash of this bubble; it's just a matter of how catastrophic it ends up being.
Basically if you're not doing manual labor, it's probably knowledge work.
Roughly 1/3rd of the working population.
Some data tucked in here: https://gist.github.com/danielmiessler/2dc039762a202b083753b...
Of course it will. The value of an employee is a multiple of what they get paid.
If you pay an employee $500k and they make $2M for your company (like Meta), then of course a 20% increase for the salary is justified if the velocity is increased 20% as well.
Imagine an employer with 10 employees paying $500k per employee and making $2M per employee in revenue (to use your numbers). They could hire two more employees and spend an extra $1M (+20%), but make an extra $4M in revenue (+20%). Alternatively, they could buy all ten employees a $100k AI subscription, for a total of $1M extra spending (+20%) but an extra $4M in revenue (+20%). You'll notice both scenarios are identical, so an employer optimizing for profit would have no reason to prefer one over the other.
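The two scenarios can be laid out side by side in a short Python sketch. All inputs are the numbers from the comment above; the function and variable names are hypothetical, and the 20% velocity gain is assumed to translate one-to-one into revenue, as the comment does.

```python
# Inputs from the comment above.
EMPLOYEES = 10
SALARY = 500_000                  # USD per employee per year
REVENUE_PER_EMPLOYEE = 2_000_000  # USD per employee per year


def hire_more(extra_employees: int) -> tuple[int, int]:
    """Return (total cost, total revenue) after hiring extra employees."""
    headcount = EMPLOYEES + extra_employees
    return headcount * SALARY, headcount * REVENUE_PER_EMPLOYEE


def buy_ai(subscription_per_employee: int, velocity_gain_pct: int) -> tuple[int, int]:
    """Return (total cost, total revenue) after buying AI for everyone.

    Assumes the velocity gain converts directly into revenue.
    """
    cost = EMPLOYEES * (SALARY + subscription_per_employee)
    revenue = EMPLOYEES * REVENUE_PER_EMPLOYEE * (100 + velocity_gain_pct) // 100
    return cost, revenue


hire_cost, hire_revenue = hire_more(extra_employees=2)
ai_cost, ai_revenue = buy_ai(subscription_per_employee=100_000, velocity_gain_pct=20)

print(f"Hire 2 more: cost ${hire_cost:,}, revenue ${hire_revenue:,}, profit ${hire_revenue - hire_cost:,}")
print(f"Buy AI:      cost ${ai_cost:,}, revenue ${ai_revenue:,}, profit ${ai_revenue - ai_cost:,}")
```

Under these assumptions both options cost $6M and bring in $24M, so the choice only tips one way if the velocity gain exceeds the spend increase, or if hiring carries costs the model leaves out.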
The market is shrinking and saturated already and it’s not because of AI gains but geopolitical instability and supply chain issues, some of which are caused by AI spending and stupid ass PE firms refocusing on AI supply chains.
Only our pensions and futures burning.
People stopped buying shit.
- S&P has a Q1 2026 blended revenue growth of 11.3% according to FactSet - most sectors are growing, not just tech
My take is the product has been very useful for coding (PMF) for months. But it’s certainly not useful at any cost…
I think it was clearly useful for months to people who had tried it and taken the time to understand it, but now that knowledge has spread to the point where wallet holders are convinced it's not just a passing fad or hype, so now PMF can be "claimed".
I agree it's weird to say "those people have pmf" though, usually it's something you define for yourself
I'm not sure if this runs counter to your point or not, but: I don't see any future where LLMs aren't a core part of Software Engineering. The horse is out of the barn. There is no going back.
And I don’t even necessarily disagree with OP! It’s more like the competition is shifting so quickly that your competitors could undercut your PMF in a blink of an eye.
people -> programmers, I haven’t met a non-developer who reports getting more time out of current AI platforms than they put in. If anything I’ve anecdotally heard the opposite, introducing AI at work creates so much slop (output) it takes more time to process it all without a tangible bump in overall productivity
And that's just one inflection point. We've had several and there are many more on the horizon. So while I could be convinced that ROI is maybe not even positive today despite the ridiculous enterprise spend, it's perfectly rational to pave the way today for what's coming over the next few months let alone years down the line.
That's why most here shouldn’t engage in the discussion: they parrot on about benefits without identifying and articulating the costs and, moreover, how it affects the firm's financial position.
"I’ve called November 2025 the November inflection point because that was when GPT-5.1 and Opus 4.5, combined with their respective coding agent harnesses, got good—good enough that we’ve spent the last six months adapting to agent systems that can reliably get useful work done."
If I make an argument and you disagree that's fine with me, provided I didn't use misinformation or sloppy thinking in making that argument.
My root comment simply represented my two cents about the current post. I don't think anything about the post is outrageously incorrect or anything, just somewhat confusing. You're a very prolific contributor in this community and I don't think me or anyone else that welcomes your takes expects everything you write to rock our collective socks every single time, anyway.
52 on AI misuse: https://simonwillison.net/tags/ai-misuse/
149 on the unsolved challenge of prompt injection: https://simonwillison.net/tags/prompt-injection/
40 on slop: https://simonwillison.net/tags/slop/
If you want an "LLM evangelism blog that rarely, if ever, has any critical analysis that isn’t pro-industry" there are plenty out there. I'm not one of them.
Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products.
(And that's after taking into account the METR paper that says engineers over-estimate their productivity with these tools.)
I have plenty of doubts about AI delivering on its promises outside of coding. I don't write about AGI because I think it's science-fiction hysteria. I write about slop precisely because it represents a mis-use of AI that demonstrates people completely misunderstanding what it's useful for.
"Many people still think AI coding agents are slop on steroids despite all the current hype around AI actually shipping functional products."
Oh yes, tons and tons, especially on HN. But the plural of anecdote is not data. Enterprise spend speaks for itself. You are using AI-coded functional products all the time. Do you want like a diff history for the Google codebase or something?
>"These are tools which burn vastly more tokens, but are also quickly becoming daily drivers for the work carried out by extremely well-compensated professionals."
>"Somehow this fragment turned into headlines like Uber’s COO says it’s getting harder to justify the money spent on AI tokenmaxxing, because the market for stories about AI failures remains enormous."
Yes, it's just the yearning for AI failures. It couldn't possibly be runaway costs, record revenues, and massive layoffs. It couldn't possibly be that these tools are lighting dollars on fire by people already paid significantly well and not producing any increase in "value" for it (I recognize that output is 100x but outcomes are flat by all measures).
[1] https://cmr.berkeley.edu/2025/10/seven-myths-about-ai-and-pr... [2] https://futuretech.mit.edu/publication/crashing-waves-vs-ris...
It's a natural response for society to despise these people who have such contempt for us. It's almost embarrassing these days being at a social function and telling people I work in software, it's got a negative stigma almost like working in gambling or the military.
How enormous? 1 trillion dollars, 2, 10 trillion enormous?
I don't see the business model working. My closest friend actually does automation software for large companies.
He does not use Claude or OpenAI at all. He primarily uses gpt 120b on Cerebras and glm-5.1 for heavy thinking work, and some other small models for various tasks. All open source.
And these systems are extremely useful for the businesses and are able to run fully automated pipelines that are very stable and fast.
We discuss this a lot, and we both think any business doing heavy agentic work on Claude and OpenAI just isn't aware of exactly how good and cheap open source has gotten over the last year.
So... once the legacy businesses and developers catch up, won't Claude and OpenAI be unable to recoup their costs?
At work I mostly use Claude Code and a bit of Codex; personal projects are OpenCode and honestly I prefer it.
Not sure about other domains though.
Same. It's a nightmare from a Porter's Five Forces perspective.
There will be a ton of businesses competing in this space, and there will be something of a moat due to how capital intensive the business can be, but there will still basically be infinite competitors.
Great for consumers.
Like how snapchat kind of fell off because the feature could just be a subset of instagram
It seems like it would just become a commodity like EC2
Most of the money right now is in coding. Openai and Anthropic just have to be 6 months ahead of SOTA open source models and they'll capture most of the enterprise and dev market
I highly doubt I'll ever use Claude again.
I think you are wrong about Claude being any significant level better
Currently, the difference is substantial, but what happens if capabilities saturate?
This is transparently false, because the best "model" is still competent human developers. They're just more expensive. If you're willing to use current LLMs at all, it means you're willing to sacrifice quality for a better price, and your disagreement with the comment you were replying to is entirely about what the optimum tradeoff is.
Unless ofc there was an actual speed difference. The only reason I'd go with a model a couple of percent worse than the current best is if it was at least 5x faster. Looking forward to kimi k2.6 offered publicly by Cerebras
That's fine. Other people may not want to pay 300x more and will rather make do with last year's SOTA.
> For coding you always want to go with the best model
Maybe you meant "For coding I always want to go with the best model"?
Currently I have no way of telling if big changes in their rankings are caused by a single "whale" switching providers, or if it's a more meaningful trend.
And this is why many companies go out of business. You always want the best bang for your buck, sometimes this is the "best model" and sometimes it is not.
Ofc again, I can be convinced to switch if there's a clear speed difference, like 5x+ for an open source model, even if it was only SOTA 6 months ago
Will this always be true? There will never be an event horizon/point of diminishing returns where something not-bleeding-edge is "good enough" for 51%+ of users?
Oh, hey, I recognize you. Thank you for the very forward and thorough orbital sander recommendation at Home Depot. That's exactly what I wanted to deal with on my holiday weekend. You just know so much about this and the rest of us are simple passersby.
And also, people have it wrong… their models are not the main problem anymore. It’s the RAG
An agent harness with access to a good search tool is a much more interesting thing than 2024-era RAG systems.
I agree with the common trope that open models lag behind by about a year, but something magical happened just around a year ago when the state of the art models became extremely useful. By this reasoning we're about to see open models perform well, but I'm afraid there is more to it than just waiting for another revolution around the sun.
Note, my application is coding assistance. Open models can be great for other purposes.
In my latest experiment I used Opus for the implementation plan, then used Cursor Composer 2.5 for execution.
I must say that combo is really good. The main drawback of Claude Code is that it is super slow. So when paired with Composer, which is super fast, it flies.
But there have been very good open source office apps for decades and few enterprises use them, so perhaps this is just the nature of B2B purchasing committees and 'nobody getting fired for buying IBM.'
“Tokens” don’t have an intrinsic cost or value. Saying that I used $2,180.16 worth of tokens is like relying on the salesperson to convince me I’m getting a billion dollars worth of pots and pans for $19.99.
I think it’s funny how we are throwing critical thinking out the window when it comes to evaluating biased sources of info.
I spent $200. If I had been paying API pricing it would have been $2,180.16. The article is about how enterprise customers get charged API pricing, which means if I had been employed by one of those companies I would have cost them $2,180.16.
What am I missing?
We have no market convergence on tokens yet (and it'll differ between LLMs), so it's impossible to say what value you got for your $200.
But to your point, re-reading the article, this is not what Simon is saying at all; he's just pointing out that he got to use ~$2000 "worth" of tokens on his $200 plan. Which makes total sense! Subscriptions are sticky, that's why the entire software industry moved towards subscription models (as much as we hate it); the person paying $200/month is more likely to stick around than the person who paid $2000 using the API.
Simon is saying that companies are (today) willing to pay API prices for tokens which is as good as any determination of value.
You seem to be suggesting the price of tokens is entirely disconnected to the cost of providing the service? I don't see much basis for that assumption.
Does that mean you'll be saving $99k?
It sounds an awful lot like the mark-up to mark-down scheme where the price stays the same.
Also, to just color in the picture here, as I haven't seen it mentioned elsewhere: there is a very large SaaS company at the moment that has given everyone unlimited tokens on Claude. And they have a dashboard showing who spends the most. So the "budget" went from about USD 500 per person (split between Claude and Cursor) in Jan to... well, a soft limit of USD 100k... per month... per person.
People can still see the top-line sticker price on their spend, but honestly I can't believe that the SaaS company is paying that full price when the invoice comes in.
That said, there are some finance reports which are probably dropping soon where we will find out!
I shared that assumption until yesterday, when I found out that it wasn't holding for LLM pricing from OpenAI and Anthropic. That's what inspired me to write this piece.
I think those token leaderboards are an obviously terrible idea and will go extinct very quickly now that people are paying attention to costs.
> I shared that assumption until yesterday, when I found out that it wasn't holding for LLM pricing from OpenAI and Anthropic.
This reads like GP saying "enterprise never pays sticker price" and you responding "I thought so too until I saw the sticker price".
Is there some info you have that you can't/didn't share? Your article doesn't offer anything beyond the above.
> With the pricing change, customers of Claude Enterprise, a two-year-old bundle of products meant for large companies that now includes Claude Code and its work assistant, Claude Cowork, will have to pay for the amount of computing capacity they consume while using the software on top of a monthly flat fee of $20 per user, an Anthropic spokesperson confirmed.
There was a Hacker News thread the other day where a bunch of people confirmed that their organizations had seen this too: https://news.ycombinator.com/item?id=48278610#48280906
Could be fantastic for small shops while it lasts. The big guys have to pay 10x for precious tokens.
your point is large players won't pay those prices at massive volume. ok
The point being made above is that API pricing is calculated... somehow... seemingly arbitrarily. Possibly untethered from the infrastructure costs entirely, which would be the basis of any 'value'; however, that assumes the labor theory of value, which isn't accurate either. So how do you accurately price these tokens at all (other than through price-discovery: which is slow, messy and fuzzy)?
Like anything else in the economy: at the point where enough customers can pay you, and not enough will go to the cheaper competition.
> (other than through price-discovery: which is slow, messy and fuzzy)
I notice a distinct lack of reading or comprehension (from everyone around me now, not just this comment) which worries me. I worry that LLMs are to blame. No one reads anymore...
As with pretty much anything priced on volume/usage.
Enterprise deals are negotiated ad-hoc, the listed pricing is simply a jumping off point for the final negotiated discount.
If you’re going to give 20,000 employees Claude code you are not going to be spending $1B per year on Anthropic tokens as if you gave everyone an individual API key. Just as Anthropic isn’t paying AWS SES $10,000,000 to send 1 email update to their massive user base when the next Claude version drops.
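To make the scale of that figure concrete, here is a rough sketch of what $1B per year across 20,000 seats implies per person. The two inputs come from the comment above; the `discount` argument is purely hypothetical, since nobody in this thread has shared an actual negotiated rate.

```python
from decimal import Decimal

SEATS = 20_000                          # headcount from the comment above
ANNUAL_SPEND = Decimal("1000000000")    # the $1B/year figure from the comment above

def per_seat(annual_spend: Decimal, seats: int) -> tuple[Decimal, Decimal]:
    """Return (annual, monthly) spend per seat, rounded to cents."""
    annual = (annual_spend / seats).quantize(Decimal("0.01"))
    monthly = (annual_spend / seats / 12).quantize(Decimal("0.01"))
    return annual, monthly

def discounted_total(annual_spend: Decimal, discount: Decimal) -> Decimal:
    """Annual bill after a negotiated discount.

    `discount` is a fraction (e.g. Decimal("0.30")); any value passed here
    is a hypothetical, not a known enterprise rate.
    """
    return (annual_spend * (1 - discount)).quantize(Decimal("0.01"))

annual, monthly = per_seat(ANNUAL_SPEND, SEATS)
print(f"per seat: ${annual}/year, ${monthly}/month")
```

So the undiscounted scenario works out to $50,000 per seat per year. Whatever discount actually gets negotiated scales that number linearly.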
Going to be interesting to determine the metrics we give to engineers for deciding whether the spend on this is worth it: measuring PRs, lines of code committed, commits fully generated by agentic workflows, etc.
Do you have any numbers or reports to back that up?
How much do you think emails cost? That number is just so far off?
But besides that, running SES is also quite a bit cheaper than SOTA AI models with high demand and (comparatively) no competition. And there's quite a bit more pressure to make money (soon).
edit: I missed the "enterprise" feature matrix with the usual audit/compliance stuff to force the biggest enterprise customers onto enterprise plans. Otherwise the "teams" plan is much better value for any business.
orig-continued:
https://claude.com/pricing/team
Teams premium is "Everything in standard, plus more usage*"
And from my experience, it's a very generous usage, I've only hit the limits once or twice, and both times required multi-boxing agents.
I could single-window agentic development all day on opus-4.7 auto-mode without hitting limits.
If you're a business using Claude, then that seems like the right plan. The enterprise/API plan seems more suited to cases where your product is built on top of the agents themselves, so seats/limits aren't really meaningful?
Yes, value is hard to calculate, but luckily market pricing mechanisms exist exactly for this purpose. There isn't a better number to use than what people are willing to pay for them.
So he's saying that on an enterprise plan, he'd be spending $2,180.16. He's not paying that much, but enterprises are.
In contrast, imagine if we had the same AI 20 years or so ago. Could AI really write Jersey? I guess not as people were still trying to understand JAX-RS. Could AI really answer all the questions about React? I guess not as React was just invented. Would we use 10x fewer people to build out infra on the public cloud or the entire so-called Big Data platforms? I guess not, as they were still rapidly evolving and we'd need so many engineers to explore so many different possibilities? Could we use AI to build our ML ecosystem with 10X fewer people? I highly doubt so. Heck, 20 years ago R was all the rage and Python's ecosystem was not mature at all. Oh, and mobile computing, could AI lead to 10X fewer people to build all the mobile apps and the underlying infra?
> Would we use 10x fewer people to build out infra on the public cloud or the entire so-called Big Data platforms?
No, cannot solve core problems, makes a mess at scale
You are right about the incremental work. But most of the work is historically incremental imo, only few positions are R&D.
I'm skeptical that their current price raise is sufficient, and I'm also skeptical that most users/businesses will accept more significant price raises that will be needed. Especially for individual users, $200 a month is already incredibly expensive, I really don't think most people are going to be willing to pay more like $1000 a month.
A single 3D CAD license pack for the guys in our R&D group costs multiple thousands of dollars per seat, per month.
It's about time software seats get some love too.
[0] https://winchdesign.com/ [1] https://www.superyachts.com/directory/1516/winch-design/flee... [2] https://www.autodesk.com/design-make/articles/naval-architec...
Those tools are used in ways that make them integral to processes. They have their equivalents of ticket systems that are linked to code repositories with LFSs and a bunch of IDE-type tools and automated and manual test systems and build systems. Their equivalents of PR discussions and Selenium screenshots need to check all boxes in the right ways for legal and traceability purposes.
Without all that might be $175/user/month but you're not shipping apps with just vi and bare gcc.
You're right, Linus uses Emacs.
I might agree "AutoCAD" is the level current LLMs are at, but wait until your design department discovers "Revit"; it's another ballpark (in wasted costs; engineers on site still get "clashes").
Revit costs are high, and the end results are marginally better, but local LLM tokens are cheaper 24/7 at the "AutoCAD" level. "Revit"-level tokens will make Uber's CTO/COO weep harder than they already do, while producing results no better than "Revit" does (engineers still face "clashes").
For a pretty funny comment about pricing.
https://www.reddit.com/r/chipdesign/comments/1ajrli2/cadence...
edit: typo
What does ICP mean?
So the author claims he's getting $2000 per month worth of frontier AI free of charge. Ok. If he's been doing that for 6 months that's $12k. What has this produced concretely? For $12k you can find a used car in decent condition. Heck for $1200 (his actual out-of-pocket spend) you get a brand new ebike! (on which you could put a pelican and make a photo of both if that's your fancy). But here it's unclear what has come of it.
(It's mostly open source, you're welcome to dig around in https://github.com/simonw and https://github.com/datasette if you like.)
My time as an experienced software engineer is worth a lot of money - a whole lot more than $12,000 for the past six months.
As you might suspect, this is what I have an issue with. Without LLMs, isn't it possible or even likely that that code wouldn't have been written at all, and wouldn't have been missed? If LLMs are mostly used to produce throwaway prototypes then it's a stretch to say that's money well spent.
If indeed it let you advance your main product much faster then sure it's a different story. You're the judge of that. It's hard to see the impact from the consumer side; everything is still broken and no extraordinary app seems to be emerging. Maybe it's just a question of time. We'll see.
Open source software changed the world. AI that will cheaply write whatever you want in a few days will also change the world.
And that is likely a fair assessment, though I understand perfectly the feeling that you have that you are accomplishing great(er) things thanks to AI.
I take some reassurance from knowing that they are indeed used by real people to solve real problems though.
From this I assume you think that what the llm has generated is as valuable as your own work generally is. How do you even calculate this?
(I have a feeling if I could say "and I closed $2m in sales with the software I wrote!" people would find a way to say that didn't mean anything anyway, because how can I prove I wouldn't have made those sales writing it by hand?)
> Stories are circulating of companies surprised at how expensive their LLM bills are becoming from usage by their staff
> Enterprise customers are now paying API prices
How long before enterprise customers start to question the bill? Anthropic goes from not making money to doing a pricing shakeup, and now they are making money and the biggest spenders are shocked at prices.
Seems like things are still very uncertain.
But memory costs are going way up. And both OpenAI and Anthropic bumped up the price of their frontier models in April.
Supply will eventually catch up with demand. Then the prices will come back down.
Legalities aside, you need to look not at the model quality but at the infrastructure needed to scale these models from tens (now) to hundreds (soon) of millions of users. Only a handful of companies actually have the resources and funding to do that. That's what these huge valuations are based on. These companies are gearing up to scale to these levels. That's why they are spending on data centers. Whoever has access to those data centers gets to tap into the revenue stream of people using models running on those.
The market for frontier models is roughly split between OpenAI, Anthropic, and Google. And then you have companies like X/SpaceX, Amazon, and Microsoft being more successful with their infrastructure than their AI products and companies like Apple, Meta that have the money and the aspiration but are so far not really managing to be very successful with their AI strategies.
DeepSeek is just very poorly positioned to capture a lot of the enterprise revenue in the EU or North America. But they might become very dominant outside the US/EU. And of course China itself is going to be a huge market and equally unlikely to want to depend on US-owned AI companies.
Personally I see no difference between China and America in terms of risks of them embedding "backdoors" so to speak, but I disagree when people claim that open-weight models are obviously safe just because they can be run locally.
I'm building a product right now with some AI coding (despite my negative sentiment about AI in general, the tools are useful). I am both the product person and the engineer, and I'm pretty decent at using it, so according to the hype I should be seeing like a 10x speedup. I am not seeing that. It's definitely faster, but there are also days where I'm stuck cleaning up things after going too fast for too long, or periods where I need to put the software in front of people to get real feedback, or even periods where I just need to use it extensively myself to find the pain points and bugs. I just don't see this "running circles" once you get past an MVP and you actually need to build something secure and not embarrassingly broken.
If not, lower-priced Chinese offerings will be better, as they're cheaper per token, giving you more attempts to offset the variance.
My feeling on the former is no... I believe they tried really hard but they've settled on pure marketing now to attempt to fight off the Chinese with perceived superiority in quality.
The assumption here is that this is a positive thing.
But this very well could end up being a major negative long term by increasing the cost per user, reducing margins.
More usage = more cost = less profit.
It's not obvious that more usage is good. It's only good if revenue per user increases more than cost does. I'm skeptical about that.
That's why it's so important for these labs that they're selling API tokens for more than the compute+energy costs needed to generate them.
Every indicator I've seen is that they do have a positive margin on that. If they don't, they're screwed.
The customers of these tokens need to see returns on their projects that exceed the cost of financing.
Laying people off only goes so far.
If enough said firms don’t see enough value given the price of frontiers they will cancel and consume open source. This is the risk the frontier labs are exposed to.
Dario telling Dwarkesh three months ago that they have a margin on inference: https://www.dwarkesh.com/p/dario-amodei-2?timestamp=3528.0
They had all the incentive in the world to say "I'm not going to talk about that."
Ahhh the classic startup term whose definition is nebulous. But also, since when does any definition of product/market fit mean a product is profitable? And profitable in what sense? Unit economics? Overall company?
It's a great hook to build an article around. My core point is more that April 2026 was the point when Anthropic and OpenAI finally appeared to have figured out a credible business model.
How so? What's specifically changed? We still don't know what their unit economics are and everything you've documented is basically speculation at this point.
1. Both Anthropic and OpenAI significantly increased the prices of their latest models. They're clearly not trying to offer the lowest-price-possible to drum up demand any more.
2. Both Anthropic and OpenAI no longer let enterprise companies buy discounted almost-all-you-can-eat subscriptions. Those big enterprises are now paying full API prices.
3. According to reasonably well-sourced leaks, Anthropic may be about to have their first profitable quarter.
And I didn't even say "profitable", I said "credible business model". I think getting companies to spend hundreds of dollars per month per seat, WITHOUT crazy subscription discounts, is a credible business model.
I've been calling that out for a couple years now. LLMs' best and most viable use case is still just as a dev tool. Even for non-programming tasks, I still get better results from the LLM if I instruct it to write code to do the task...look at Claude Cowork for example, it's everything I used to do with python myself. It's not really a novel capability, it's just using python & bash for automations that any sysadmin has been doing for decades. Yeah, that's valuable for a non-technical audience but is it $1T valuable? I don't think so.
When has an IDE or other dev tool ever commanded a $1T valuation?
These things get lost in discussions because people conflate "overvalued" with "not useful." LLMs are useful, particularly as a dev tool, but Anthropic & OpenAI are definitely way overvalued.
Anthropic and OpenAI have shown people want a tool for task offloading, driving predictable token consumption and justifying the math, so long as users stay in that dynamic.
However, knowledge workers using these tools daily are getting exhausted with them. Outputs come out polished but hollow. Talking to a frictionless, frame-completing model all day drains you.
If user behavior drifts away from assistant usage because of that, per-token math implodes. The valuations we're hearing about all the time rely on usage compounding daily. The fatigue is a timer running against that compound.
Anthropic's Constitution is the closest hedge out there, I think. Installing an identity structure into the model through training. But it's still assistant-first, so the fix there is only partial.
I've spent the last year running a product that flips the architecture so identity is primary and the assistant role is secondary. Same frontier models, completely different conversational quality. The fatigue property doesn't really show up.
Whichever labs figure out how to install real identity natively in the weights are going to be the ones with PMF in the next phase.
You may want to get one of them to check the math on that :p
Firstly, if the user is asking for things where AI can link to products or services to buy, there's a very good relevancy, much higher than in other types of ads.
Secondly, since the AI often takes time to compute answers to user's questions, they could be shown ads while waiting. People could perhaps be less annoyed by this than some other commercials since they know the break has to be there anyway.
(First idea is something I came up when asking Claude to compare some products, or ask for help in lawn care. Second idea was by a colleague.)
it is only true for USD. for example if you pay in euro, this is actually more expensive. kind of makes no sense, because it translates to $1 = €1
Other than the hosting providers, I am also yet to see anyone directly making money from their OpenClaw agent.
Ran `ccusage` on my Claude Code logs.
- Total tokens: 22.2B
Without current Claude deals, my personal cost would have been *~$112,000*.
While the big guys will argue they’re worth trillions expect others to drop chaos booms showing their NPV may be effectively zero.
How many tokens is that, input/output-wise?
(a) I'm curious if you feel like you got $2000 worth of value out of them in the last month?
(b) I'm also curious if you would have gotten similar quality out of a slightly lower-cost provider of an open-weight model? (e.g. Kimi K2.6 and DeepSeek v4 Pro) and what the spend would have been for that.
I myself have managed to spend not quite $4 on OpenRouter and have felt it was very worth it; I just have much smaller, or more targeted requests I guess. (Lately, adding features to a static site generator in Python, or setting up log forwarding via a docker compose file)
Claude Code: Input tokens: 52,545,485
Output tokens: 5,767,253
Cache create tokens: 5,112,029
Cache read tokens: 1,475,069,465
Total tokens: 1,538,494,232
Total cost: $1,199.79
OpenAI Codex: Input tokens: 52,598,013
Output tokens: 4,681,867
Reasoning output: 2,091,063
Cached input tokens: 1,153,844,864
Total tokens: 1,211,124,744
Total cost: $980.37
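As a sanity check on the two usage summaries above, here is a minimal sketch that re-adds the token counts, works out how much of each total is cache reads, and compares the combined API-priced cost to the $200 figure mentioned upthread. It assumes the first (unlabeled or Claude-labeled) block is the Claude usage, which the surrounding comment implies, and that Codex's "reasoning output" is already counted inside its output tokens, since the stated total only adds up that way.

```python
from decimal import Decimal

# Figures copied from the two summaries above.
claude = {
    "input": 52_545_485,
    "output": 5_767_253,
    "cache_create": 5_112_029,
    "cache_read": 1_475_069_465,
    "total": 1_538_494_232,
    "cost": Decimal("1199.79"),
}
codex = {
    "input": 52_598_013,
    "output": 4_681_867,        # assumed to include the 2,091,063 reasoning tokens
    "cached_input": 1_153_844_864,
    "total": 1_211_124_744,
    "cost": Decimal("980.37"),
}

def components_sum(usage: dict) -> int:
    """Add every token bucket except the reported total and the cost."""
    return sum(v for k, v in usage.items() if k not in ("total", "cost"))

def cache_share(usage: dict, key: str) -> float:
    """Fraction of all tokens that were served from cache."""
    return usage[key] / usage["total"]

def blended_per_million(usage: dict) -> Decimal:
    """Average dollars per million tokens across all token types."""
    return usage["cost"] / Decimal(usage["total"]) * 1_000_000

combined = claude["cost"] + codex["cost"]
multiple = combined / Decimal("200")   # $200 is the subscription spend quoted upthread

print("Claude cache-read share:", f"{cache_share(claude, 'cache_read'):.1%}")
print("Codex cached-input share:", f"{cache_share(codex, 'cached_input'):.1%}")
print("Claude blended $/M tokens:", round(blended_per_million(claude), 2))
print("Codex blended $/M tokens:", round(blended_per_million(codex), 2))
print("Combined API-priced cost:", combined, "=", round(multiple, 1), "x the subscription")
```

The combined figure matches the $2,180.16 quoted earlier in the thread, and in both tools cached tokens make up the overwhelming majority of the volume, which is why the blended per-token price lands far below the list price for fresh input or output.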
I'm confident I got value out of OpenAI - I've been mainly on Codex for the last few weeks. Not so sure I got that value from Claude, just because I've been using it a lot less and somehow the price came to about the same as OpenAI.
Given the code I've been able to build in the past month I genuinely do think I got value for the API price version, and (don't tell OpenAI or Anthropic) I think I'd have paid full price.
I've not spent nearly enough time with GLM-5.1 and co to compare, but I do know that the prompts I'm using with the agents are not prompts I would have expected to work just three months ago.
When I account for the amount of time it saved me there's no question $2,000 was worth it.
Personally, I've probably spent $60 or so on OpenRouter in the last month or so and got a working project out of it that it would probably have taken me a fortnight to knock together (which is inevitably an under-estimate because it covered things I'd have to learn but K2.5/6 already knew). There's an orders-of-magnitude gap there.
The impact of AI in other fields seems to be muted.
Software development has the huge advantage that mistakes and hallucinations are very easy to spot: the software works or it doesn't.
Spotting errors in a research report or legal brief is a whole lot harder!
But... non-software professionals spend a huge amount of their time on tasks that can be safely automated - reformatting documents, extracting numbers from PDFs, all kinds of flavor of data entry.
Learning how to use a tool like Claude Cowork can take a big dent out of those.
Do we not care about code quality, maintainability, performance, extensibility, or understandability anymore? Honest question, not a gotcha, it's just previously getting software to pass all the tests was a small part of what we would consider "working" or perhaps "good" software. Maybe that's different now with LLMs, idk. Maybe we need automated checks for these things as well, like not compiling until the code quality is good enough to let the agent finish its loop.
Yes, we should care. I've been writing a whole book about that: https://simonwillison.net/guides/agentic-engineering-pattern...
This isn't me being a doomer I just don't know. Can we look at Q2 profits and draw hockey sticks yet?
Remember people are boasting how much their expenses are. That is where we are in the bubble/new paradigm.
You think this is a fantastic deal only because they use the same kind of trick where they inflate the price and tell you something is supposed to cost $1000 but there's a promo today for $100.
I was there too, paying for a while. A few weeks ago I tried DeepSeek V4 Pro. I expected it to be shit, but it's actually pretty good.
The deal is I pay ~$1 daily for DSV4-pro for ~100M API tokens of usage. And they're probably not going broke, because >90% of those tokens are in practice cache reads and they are very well optimized for that.
So ballpark same price per parameter as Simon.
Operating profit is both post depreciation and fees paid to third parties for hire. So aside from shenanigans like RSUs and financing interest that's already somewhat close to actual economics.
Meanwhile we've got commenters here talking of 5-10 trillion with a T revenue shortfall.
Those are very different takes on reality
So many startups trying to automate sales, but somehow the two biggest frontier labs have decided that the best GTM strategy is firmly human-in-the-loop.
However the valuations are still far far away from actual sanity
I use glm-5.1 and occasionally DeepSeek V4.
They are as good as or better than Claude's latest models.
And significantly cheaper. I've converted 3 of my engineer friends as well. All three have dropped the $200/month plans they had with Anthropic.
We've all been a bit shocked at just how good these models are now.
If you "have" tried GLM (I specifically find it shockingly good for code), did you think it isn't competitive with Claude, and why?
It's good enough for personal stuff. It doesn't compare to the latest Opus I use at work. You can certainly argue I don't need Opus for work, but there is clearly a difference.
Also, at least with z.ai, GLM-5.1 is s l o w! After using Claude at work, I get really impatient with GLM-5.1 at home. When doing "true" vibe coding (i.e. not really examining the code), Opus is a ton faster (easily 5x).
But yeah, I'm not willing to personally pay for the frontier models. I won't even renew my annual Z.ai plan - it's become too expensive.
Also, and I know you may not want to answer. But could you give me an idea of the type of thing you found glm to be worse with?
I think I've been fairly unbiased in testing a bunch of different development tasks. But am curious if maybe it performs well for some stuff and not others. So if you could share what you feel it's worse at.
Also are you an experienced developer or less experience?
When DeepSeek V4 Pro came out, I had been mostly coding with GLM-5.1 on a Z.ai coding plan.
I had a large analysis task on a relatively complex codebase. I decided to try the models out.
GLM-5.1 did acceptably but got a few things wrong (easily corrected) and took quite a while to get there.
Opus 4.6 burnt through the US$10 budget I had given it in about 10-15 min, without ever returning from the first prompt.
DeepSeek V4 returned a full analysis within 2-3 min, and I carried on all the way to implementing the feature I was after. Total cost less than US$1.00.
I now mostly alternate between GLM-5.1 and DeepSeek V4 Flash, with an occasional dip into V4 Pro for more complex analyses.
right now everyone is using latest and greatest to do dumb stuff like that. that would change fast if companies start caring about costs.
Any org with more than 150 users isn't on $200/month plans; they are forced into API pricing + $20/month/user.
For individuals and orgs small enough to get to use the subscription plans, that's all well and good until usage limits keep going down, or cost goes up. If you compare the usage you get on $200/month maxed out vs. what that would cost at API pricing, the $200/month plan is an absolute steal. I doubt it will last long.
On the plus side, I'm happy I'll have a nice hay barn when the local half-built AI data center is abandoned.
Recent conversation here on that topic: https://news.ycombinator.com/item?id=47062534#47063134
But I also think that their API token pricing represents a real margin over the inference costs for serving those tokens.
Both things can be true at once.
But that's the point of the article. Enterprise plans are starting to get API pricing, not the subsidized subscription pricing.
Intelligence is a universal good, it can apply to anything, and no, "human intelligence" is not the only form that is useful nor special. There are limitations to AI but also huge advantages, and it's obvious that the advantages are worth paying for, given their revenue.
Many of us are openly having our performance reviews tied to AI use, especially at larger enterprises, whether that's measured by sheer token count or just "how many of your tasks are you using AI for these days" (combined with the implication that question carries at many orgs which are heavily invested in AI).
I don't think that's the case. I think the token leaderboard thing (which is clearly ridiculous) affects a tiny portion of companies and is already going out of fashion.
We're also in a place where a lot of the usage guidance around these tools is still nascent. People are cowboying a lot of stuff, even as larger companies start to organize AI policy/safety/responsible use working groups to try to set policy around the shortfalls of the technology.
IMO: if this technology persists, and if we figure out a way to use it in a broadly safe way, the value proposition will probably trend down rather than up, at least on the code generation front.
As a research tool, it shows some promise, though I still find the ethics of the technology disgusting.
There's a whole bag of clever tricks you can play to juice short term results leading to an IPO that may not work longer term.
I'll believe they've found product-market fit when they have a product. Right now they're selling the infrastructure, in a highly subsidized and undifferentiated way (at least over a sufficiently long period of time of, say, a couple of years).
Just imagine how funny it will be if it comes out that big labs were doing some fancy maths to count the 2k$/month in their forecasts ...
Is that quarter the same as any other quarter in terms of infrastructure costs (e.g. are there any temporary discounts happening coincidentally)?
Funny to see the change of tone - a lesson for people not to get too ahead of themselves.
You financially benefit from stuff like agents. Of course you will be the last to admit publicly when things aren’t quite heading in the right direction. The gap between hype and reality is ever increasing.
There are still several open points (e.g. code churn, maintainability, subtle bugs a human would never introduce) that everyone with minimal programming knowledge who has seriously used an LLM agent knows about, but somehow none of these "big influencers" ever mention them (or they just say "it's your fault").
I notice this all over the place. Many people hate AI and want it to fail, and they're willing to invent misinformation if it supports that idea.
I repeat: a CTO saying that they spent their entire AI budget for 2026 when that budget was clearly set in 2025 before anyone knew what those November models + harnesses were capable of is entirely unsurprising. Any analysis that doesn't also point out the difference between 2025 and 2026 era coding agents is either ignorant or deliberately misleading.
(We still don't even know what Uber's planned AI budget for 2026 was. They didn't reveal that when asked - in https://www.theinformation.com/newsletters/applied-ai/uber-c... it says "He wouldn’t disclose exact figures of the company’s software budget or what it spends on AI coding tools").
https://www.businessinsider.com/uber-coo-andrew-macdonald-ai...
You know their business is literally the correct interpretation of C-suite statements.
Hah, I just checked their homepage and here they go again amplifying that COO fragment from that podcast:
https://www.businessinsider.com/tokenmaxxing-debate-uber-exe...
> "That link is not there yet, right?" Macdonald said in comments that went viral, racking up over 2 million views on X. "I think maybe implicitly there is more that is getting shipped, but it's very hard to draw a line between one of those stats and, 'OK, now we're actually producing 25% more useful consumer features.'"
Yeah, something going "viral on X" is clearly a sign that it's quality information!
For someone who cares about media hype - https://hn.algolia.com/?query=author%3Ahansmayer%20hype&type... - you don't seem to be very discerning with regards to this particular story.
In hype-driven markets, you cannot be certain of that.
Let's take a view that the author is right: coding agents and their associated harnesses were the inflection point for some degree of profitability and widespread consumption, and that these tools are now yet another SaaS subscription or API bucket expense to bake into every single developer (or developer-adjacent) in the organization alongside your collab suite, HR seat, CRM seat, design seat, etc. To be fair I honestly think that's a safe assumption to make for highly technical firms whose image is derived from remaining on the cutting edge of things.
That raises the following questions, which we won't be able to answer until IPOs start happening:
* Are subscriptions profitable, or just API consumption?
* What's the run rate when we just consider subscription-based usage like Claude Code and Codex? What about API calls?
* Is there any profitable pathway forward at which enterprises can get unlimited usage but at fixed rates via subscription?
* What does customer churn look like for subscription users versus API users?
We also have a number of questions for customers that I suspect we'll start seeing receipts for in the coming months, at least from the early adopters:
* What was the net gain (loss) from leveraging coding agents?
* What's the cost of a developer with or without access to a coding agent + harness? Is it cheaper to hire an outsourced worker with a coding agent subscription, or a domestic worker without one?
* At what point does further AI spend result in diminishing returns, i.e. where's the 'sweet spot' for spend?
* Did AI boost actual revenue and outcomes, or did it just gamify KPIs?
* What roles or work did AI actually replace, versus merely displace during the hype cycle?
Not to mention the questions regarding the technology itself:
* Will we develop the means to run foundational/frontier models at the edge using fewer resources through some existing (e.g. distillation) or new technology, thus cutting off the profit centers of these firms?
* When the market mismatch between supply and demand is resolved, won't it be more affordable for consumers and companies to operate their own AI infrastructure rather than support further centralized buildouts?
* Will coding agents improve to the point of being able to bootstrap and self-orchestrate on edge/consumer hardware without substantial technical expertise, or at least improve to the point that traditional IT teams can securely operate them internally without an expensive subscription or API token bucket?
All of these will influence the long tail of this bubble, because it is a bubble at this point. Even if these companies are indeed profitable thanks to the coding agent inflection point, there are still so many unanswered questions about utility beyond coding that it's impossible to extrapolate a future. If coding agents are indeed the extent of utility for profitability, then there's no possible way these entities will recoup the investment already sunk into their infrastructure buildouts. Even if more profitable uses are discovered, does this offset or replace the firms disappearing due to AI speculation and their associated contributions to the economy as a whole (RE: the consumer compute industry at present, higher energy costs due to datacenter builds, opportunity cost from harms to local infrastructure from haphazard builds, etc)? Should these firms indeed be runaway successes and immensely profitable to the point of paying off their investors and growing the larger economy, does this end up stifling innovation in a world where most new ideas are fed into LLMs for R&D that are then controlled by only a handful of companies and immensely wealthy people, via systems that are easily surveilled and stolen from without recourse?
So many, many questions yet to be answered. Betting the farm because of coding agents is one hell of a gamble.
No, it's more like their own leak to WSJ and according to Ed Zitron -> seems to be heavily engineered via non-GAAP practices such as counting potential, but not realised, revenue as actual revenue - the stuff for which I would be arrested if I did it at my company.
Also it appears according to Ed's analysis - strangely they seem to be projecting only that one quarter as profitable - potentially to calm the investors ahead of the IPO. Investor fraud anyone?
Please don't forget that Ed's entire brand identity is now 1:1 with exposing "AI" as a giant, unmitigated failure.
That's a very specific flow chart to hook your caboose to when none of this is even remotely close to endgame.
Big parts of what he says will turn out to be true once the rubble settles, but it will not be anywhere near what he is predicting. How that shakes out may not be great for the average person (what money-shuffling tricks will be used?), but it won't be a total wreck.
Honestly, I think it's very short-sighted to assume that all of this will be seen as any kind of wreck in the long term.
Normies are still catching up and reacting to chat-based LLMs.
HN types are further ahead of the curve, but still catching up and reacting to agentic coding and design workflows.
What often gets completely ignored is that entirely new modalities for how the underlying tech can be applied will continue to be demonstrated, and those will once again cause new ripples of excitement and disgust.
There are companies building world models and systems for protein discovery. Comparatively speaking, these approaches are barely in the zeitgeist today.
Deciding that we already have the data points we need to extrapolate how all of this plays out is like someone in 1974 deciding that microprocessors are just for accounting and inventory. Don't be that someone.
> According to a person familiar with the company’s internal analysis, Cursor estimated last year that a $200-per-month Claude Code subscription could use up to $2,000 in compute, suggesting significant subsidization by Anthropic. Today, that subsidization appears to be even more aggressive, with that $200 plan able to consume about $5,000 in compute, according to a different person who has seen analyses on the company’s compute spend patterns.
The load-bearing detail here is whether that means $2,000 of internal server+electricity costs, or $2,000 if they were to charge at their API pricing instead of the subscription cost.
The latter is how I understand these things to work right now. If it's the former then yeah, Anthropic are losing a TON of money on those subscriptions.
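A minimal sketch of the arithmetic behind those two readings, using only the $200, $2,000 and $5,000 figures quoted above. The function names and the idea of an "API gross margin" parameter are assumptions for illustration; nobody in this thread knows the real margin.

```python
def subsidy_multiple(plan_price: float, compute_value: float) -> float:
    """How many dollars of quoted compute each subscription dollar buys."""
    return compute_value / plan_price


def internal_cost(api_value: float, api_gross_margin: float) -> float:
    """Serving cost if `api_value` is measured at API list prices.

    `api_gross_margin` is a hypothetical input (0.0 to 1.0), not a known figure.
    """
    return api_value * (1.0 - api_gross_margin)


def break_even_margin(plan_price: float, api_value: float) -> float:
    """API gross margin at which the plan exactly covers its serving cost."""
    return 1.0 - plan_price / api_value


PLAN_PRICE = 200.0  # $/month, from the quoted article
for label, quoted_compute in (("last year", 2000.0), ("today", 5000.0)):
    multiple = subsidy_multiple(PLAN_PRICE, quoted_compute)
    margin = break_even_margin(PLAN_PRICE, quoted_compute)
    print(f"{label}: {multiple:.0f}x the plan price; "
          f"breaks even only if API margin >= {margin:.0%}")
```

If the quoted figures are raw server and electricity costs, the multiple is the loss ratio directly. If they are API list prices, a maxed-out plan only loses money when the real API margin is below the break-even value, which is why the distinction matters so much.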
https://www.reuters.com/commentary/breakingviews/anthropic-g...
If you've ever been at a startup, this is exactly what it looks like when you go from not having product-market fit to having it (though with a few extra zeros on the end compared to most).
Hell, say it did, how would you possibly know?
Feb 12th 2026: https://www.anthropic.com/news/anthropic-raises-30-billion-s... - "Today, our run-rate revenue is $14 billion, with this figure growing over 10x annually in each of those past three years."
Apr 6th 2026: https://www.anthropic.com/news/google-broadcom-partnership-c... - "Demand from Claude customers has accelerated in 2026. Our run-rate revenue has now surpassed $30 billion—up from approximately $9 billion at the end of 2025."
All three of those are official releases from Anthropic. You can choose not to believe them if you like, but since they plan to IPO this year it's in their interest not to get caught lying to potential investors.
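For a sense of scale, here is a rough sketch of what the quoted run-rate figures imply. It assumes "end of 2025" means Dec 31, 2025 and treats "approximately $9 billion" and "surpassed $30 billion" as exact values; the annualized number is plain compounding of a short window, not a forecast.

```python
from datetime import date


def growth_multiple(start_value: float, end_value: float) -> float:
    """Ratio of the later run-rate to the earlier one."""
    return end_value / start_value


def annualized_multiple(start_value: float, end_value: float, days: int) -> float:
    """Compound the observed growth out to a 365-day year."""
    return (end_value / start_value) ** (365.0 / days)


# Figures from the quoted releases, in billions of USD run-rate.
END_OF_2025 = date(2025, 12, 31)   # assumption: "end of 2025" is Dec 31
APRIL_RELEASE = date(2026, 4, 6)
START_RUN_RATE = 9.0               # "approximately $9 billion"
END_RUN_RATE = 30.0                # "surpassed $30 billion"

days_elapsed = (APRIL_RELEASE - END_OF_2025).days
multiple = growth_multiple(START_RUN_RATE, END_RUN_RATE)
yearly = annualized_multiple(START_RUN_RATE, END_RUN_RATE, days_elapsed)

print(f"{multiple:.2f}x in {days_elapsed} days")
print(f"naively annualized: {yearly:.0f}x per year")
```

None of this says anything about costs or profit; it only shows how steep the claimed revenue curve is.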
Maybe, maybe not. We haven't seen that S-1 yet. All we have is the 5B in lifetime so far. PLUS - revenue quadrupled or not, it only matters if their costs did not expand at the same rate or more. Revenue is not profit.
Revenue is not profit yet the discussion in this particular thread is about revenue.
Ever heard of Enron, Theranos, SBX? They were all hiding in plain sight - who could've thought they were frauds?
No, at this level of capital involved, and so much opacity around the company financials, it's a perfectly reasonable assumption.
It's a funny metric considering Depreciation is a huge cost for them.
"We are profitable when we don't count our expenses"
Those GPUs are very expensive.
Inference is expensive because a GPU can only process a certain amount of requests in a given timeframe. Remember that Anthropic is constrained in compute.
If they are constrained, it means that those GPUs are not idle. If they have more customers, they will need more GPUs.
If they have to play silly games using EBITDA to be "profitable", then it means that they need to ramp up prices a lot more than they already did.
Which is why in these discussions I always say that inference is also extremely expensive. Too many people like to pretend without any evidence that inference is cheap.
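To make the throughput argument concrete, here is a back-of-the-envelope sketch of serving cost per million tokens. Every input in the example call is a hypothetical placeholder, not a real figure for any provider or GPU; only the structure (depreciation plus power, divided by tokens actually served) comes from the points above.

```python
def cost_per_million_tokens(
    gpu_price: float,            # purchase price of one GPU, USD (hypothetical)
    useful_life_years: float,    # straight-line depreciation period (hypothetical)
    power_cost_per_hour: float,  # electricity + cooling per GPU-hour, USD (hypothetical)
    tokens_per_second: float,    # peak serving throughput of one GPU (hypothetical)
    utilization: float,          # fraction of peak throughput actually used, 0..1
) -> float:
    """Depreciation plus power, spread over the tokens the GPU actually serves."""
    hours_of_life = useful_life_years * 365 * 24
    depreciation_per_hour = gpu_price / hours_of_life
    hourly_cost = depreciation_per_hour + power_cost_per_hour
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


# Placeholder inputs purely to show the shape of the calculation.
example = cost_per_million_tokens(
    gpu_price=30_000.0,
    useful_life_years=4.0,
    power_cost_per_hour=0.50,
    tokens_per_second=1_000.0,
    utilization=0.6,
)
print(f"${example:.2f} per million tokens (with made-up inputs)")
```

The depreciation term is the part an EBITDA-style figure leaves out. A compute-constrained provider runs near full utilization, so per-token cost is at its floor, but each additional customer then needs additional GPUs.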
Language models don't wear out the same way; upgrading is a choice.
You can "just not update an LLM" in theory. But if your competition updates LLMs, and gets more capable, more efficient LLMs, and you don't? They get more capable "expensive tiers", and cheaper "cheap tiers" of LLMs. What are you going to do then? Bleed userbase and die?
The move to buy up RAM is straight out of an industrial organisation textbook.
Back in 2024 their CEO claimed training costs would rise to $10-100B in the next few years.
https://www.tomshardware.com/tech-industry/artificial-intell...
I assume this is the quote you're referring to from Davos?
"I have engineers within Anthropic who say I don’t write any code anymore. I just let the model write the code, I edit it. I do the things around it… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs do end to end."
That was in Jan; he said "might" and he said 6-12 months. Yes! Let's hold him accountable for saying reasonable things!
Indeed. That's why serious people are very careful, even if they are not running a company supposedly worth 1T USD
> He is forced to do these predictions to know how much compute to buy in advance
Ah well, that explains it. For my companies next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact to company business based on that.
10x revenue growth per year, even more this year...his predictions about when agents will claim SWE e2e work are his speculations, relevant because people care about what he thinks as he is closer than anyone to the leading edge of the technology. It's also important for him to be as accurate as he can about this because he has to put his money where his mouth is. He has to sign the right amount of compute otherwise he screws himself. He got it wrong in the opposite direction that you're implying, so at this point it sounds like you are more interested in your axe to grind than the truth on the ground.
You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"? That's not happening at major companies. They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point.
> For my companies next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact to company business based on that.
You, as a leader of a company, don't have to make predictions? Don't have to make bets about what the best thing for you to do next year? That must be incredibly nice.
Amodei and everyone else need to plan compute and plan their products and roadmap. You want him to....not do that?
To the stunning tune of 5B in the lifetime.
> You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"?
Yeah, that's actually Dario's main talking point.
> They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point
Right, really sound arguments - writing is "clearly on the wall" and there are "so many data points". I'd be keen to use those immediately, but I am kind of missing the key of the "many data points" - namely, what did you build with it and how much ARR is it generating?
> You, as a leader of a company, don't have to make predictions
I have to make predictions, but not confabulations, lies and idiocies.
> Amodei and everyone else need to plan compute
FOR WHAT? Again, what was built with their shitty product in various companies and how much ARR did it generate? Uber seems to get no value out of it.
> Right, really sound arguments - writing is "clearly on the wall" and there are "so many data points".
Thank you for recognizing this. Don’t read Ed and think you understand anything about AI is all I’ll say. Read epoch capability index paper and look at the dashboard chart or the METR time horizon chart and methodology and then return with what I imagine from historical comments will be another ferocious and impressive act of mental gymnastics.
> I have to make predictions, but not confabulations, lies and idiocies.
Idk, you’ve been misquoting and aggressively refusing to address any facts you are presented with, and yet you bring no facts of your own (hint: if you know what you’re talking about, you can typically discuss calmly with actual facts). That feels pretty similar to confabulation. I won’t say idiocy, I’m sure you are not an idiot, but you seem to have a lot in common with your caricatures of tech CEOs.
> FOR WHAT?
Their product.
A sworn affidavit by the Anthropic CFO from Dec. 2025 is what you need to look up, mate.
So, he's closer to correct than not.
That said, your recollection is also flawed. It was in mid-March, and here are the relevant quotes:
>I think we’ll be there in three to six months—where AI is writing 90 percent of the code. And then in twelve months, we may be in a world where AI is writing essentially all of the code.
[...]
>But the programmer still needs to specify, you know, what are—what are the conditions of what you’re doing, what—you know, what is the overall app you’re trying to make, what’s the overall design decision? How do we collaborate with other code that’s been written? You know, how do we have some common sense on whether this is a secure design or an insecure design?
[...]
>So as long as there are these small pieces that a programmer, a human programmer, needs to do, the AI isn’t good at, I think human productivity will actually be enhanced. But on the other hand, I think that eventually all those little islands will get picked off by AI systems.
With another 3-4 months left on the clock, his prediction seems remarkably on point for at least certain organizations and domains.
I welcome you to also hold yourself accountable in the coming months if this trend continues. ;)
That probably explains why their uptime and reliability are so bad.
I agree that most of the things are written by AI, but writing code was never the bottleneck in big tech.
That said, I generally agree that you're correct: writing code in many ways has not been the biggest bottleneck. However, by removing much of that writing, it frees up engineers to work on the uniquely human things that are larger bottlenecks.
I had a few comments in a thread here touching on where I think most of the value has come from for us (which is largely search/understanding of our dependencies and making away team work far more viable, which aids with cutting through bureaucracy and the tendency for teams to push back on work): https://news.ycombinator.com/item?id=48298731
My company did not swallow hundreds of billions in shady investment deals and is not publicly traded. We work with real money, and the revenue on our books is the revenue that is actually booked, not fake revenue we plan in 2 years time to maybe happen. So no, I am not going to hold myself accountable. But people who work with other people's money should be absolutely held accountable when their wild imaginations don't come true, repeatedly, quarter after quarter, year after year!
You criticized a very specific (and fake/misquoted) prediction, ignored the correction, and are now criticizing vague hand-wavey "predictions" that you have left unspecified.
Can you please stop with the angry/ranty replies and actually have a real conversation grounded in actual facts?
Now, having said all of the above...I'll also point out that these are predictions, not promises/guarantees. These people are being asked to forecast and are doing so. I hardly think they should be held responsible for not being literal oracles, but even so--please, at least quote them correctly/at all.
In short: be better than the hallucinations you're seen to call out from the models.
Like, I understand the reasonable arguments against (I even agree with a few), but it's clear that some people have fully inserted their head into the sand and just don't want to believe any of this could be true. Which will be harsh, since I think getting hit with this train all at once in the future is going to be a rougher ride than a slower coming-to-terms-with, even if the result is one we're unhappy with.
So, unsourced vibes from a shady guy whose entire empire is built on being against AI?
I genuinely don't know how folks can continuously buy into anything he has to say after that Wired piece. The credibility there is seriously lacking.
Please, continue to be skeptical of the labs. But people need to stop talking about this dude as if he's the Holy Grail of the anti-AI movement. It's going to blow up in y'alls faces.
I think it's telling that most critics don't address his actual points, but instead his credibility because he's a "hater".
Actually he provides sources when he analyses stuff and imho much better than the usual corporate "Sam Altman says we should ask ChatGPT how to raise babies" crap. Also, I don't know many 'shady' guys who have built entire "empires", nor does he seem to actually have an empire. Usually being shady means you are kind of unknown and all. I am not glorifying Ed, don't even know him personally. I am not even impressed with his writing style much to be honest. But he brings important facts and information to light, which otherwise would have been lost in the cacophony of corporate media light treatment of these con-men. Holy Grail? Blowing up in our faces? WTF are you talking about?
You said it was likely an internal leak to the WSJ "according to Ed Zitron". Did Ed have a source for that, or was it just vibes?
^ Apologies, the above read to me like you were saying that Ed himself was claiming that Anthropic leaked to the WSJ.
Agreed. But it's only a great deal because it is heavily subsidized, as you said yourself. Enjoy while it lasts, but in my book, product-market fit means something along the lines of "product which enjoys a loyal customer base, sold at a price perceived fair by the customers, and generating profit". How many of these does your definition of product-market fit hit here?