My own impression, based on inference prices for DeepSeek or other "open" models in the 1T range (including providers like DeepInfra with no obvious reason to subsidize their API costs), is that Anthropic is offering subscriptions at cost (on average; power users are a bit more expensive, casual users more profitable) and making a good profit on API pricing. That profit is then spent on model training, marketing and development, for an overall negative bottom line.
Edit: in case it gets changed: current headline is "Anthropic/OpenAI may be spending more than $1000 for every $100 you pay them"
> Methodology & assumptions: No caching
This is absolutely absurd. Claude Code is of course using the cache (and this can be verified by looking at the traffic). It would be an incredibly stupid design to resend the whole input without a cache for every input, every tool use, etc.
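To see why the no-caching assumption matters, here is a rough sketch of how many input tokens an agent loop processes with and without a prompt cache. The session shape (50 turns, a 20,000-token starting context, 2,000 new tokens per turn) is made up for illustration and is not measured from Claude Code traffic.

```python
def session_input_tokens(turns, base_context, new_per_turn):
    """Toy model of an agent session where every request resends the full history.

    Hypothetical shape: request i carries base_context + i * new_per_turn tokens,
    and everything except the newest new_per_turn tokens was already sent before
    (on request 1 nothing has been seen yet).
    """
    total_sent = 0   # all tokens across all requests (billed as fresh input with no cache)
    cache_reads = 0  # tokens that were already part of the previous request
    for i in range(1, turns + 1):
        context = base_context + i * new_per_turn
        previously_seen = 0 if i == 1 else base_context + (i - 1) * new_per_turn
        total_sent += context
        cache_reads += previously_seen
    fresh = total_sent - cache_reads  # tokens processed for the first time
    return total_sent, cache_reads, fresh


total, reads, fresh = session_input_tokens(turns=50, base_context=20_000, new_per_turn=2_000)
print(f"no cache:   {total:,} input tokens at the full input rate")
print(f"with cache: {reads:,} cache reads + {fresh:,} fresh tokens")
print(f"share of input that is a repeat of earlier requests: {reads / total:.1%}")
```

In this toy session the overwhelming majority of input tokens are repeats, so pricing them all as uncached input inflates the estimate many times over.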
GPUs that can run everything from Crysis to CUDA are a harder engineering problem to solve than creating a chip that's optimized for inference. Not to mention that inference is an excellent first step towards a full, competitive GPU as well.
And it seems all of these advanced chips rely on the most advanced lithography which is tightly guarded and supply locked by a few companies.
1. That the API pricing is required to make a profit, rather than being effective market segmentation to make a larger profit.
2. That if subscriptions are loss making, it is not worth having loss leaders.
Our code base is not small: millions of lines of code. It does not take $65 in tokens to solve an issue. I'm running 3-4 Claude Code terminals at the same time and I'm still pretty close to what it would cost per token for usage. I don't know what we are doing right with our code or claude.md to make this happen, and I don't want to change it and break it.
Someone or something is having a hallucination that would make an AI jealous.
banks used to lose a lot of money on those toasters, amazing they are still in business
If it is subsidised, fine – the incumbents absorb the losses, or lean on hyperscalers like Google or Microsoft who can cross-subsidise across other revenue streams. But if it isn’t, that’s the worse outcome for them: inference is just cheap, competition kicks in, prices crater, end users win.
Either way, local models win. If the incumbents are forced to turn a profit, pricing goes up and as local compute gets good enough to handle most use cases, people flock to it. And if inference is just cheap, that means the compute requirements are lower than we thought, and local hardware gets there even faster.
Bullish on local either way. We’ll find out once the Anthropic S-1 drops.
Input: $0.257
Output: $1.286
Cache read: $0.0257
Cache write: $0.322
And I have zero doubts that, using batching and other optimizations, subscription users are being served at an even lower cost. Most of their expenses likely come from training, as we're far into diminishing-returns territory. We will know once Anthropic is required by law to report these numbers, so there's no point in continued speculation that "Anthropic is losing $9 for every $1", because 1: unless there are some subsidies going on it's not true, and 2: we will be told directly by Anthropic what the numbers are in the near future. Also, abuse of free accounts/trials wouldn't work, since it would destroy the cache, and it maintains a 97% cache rate.
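As a back-of-the-envelope check on what a 97% cache rate does to input cost, here is a small sketch using the four prices listed above. Two things are assumptions on my part: that the prices are USD per million tokens, and that every non-cached input token is billed as a cache write.

```python
# Prices as quoted above; the unit (USD per million tokens) is an assumption.
PRICE = {"input": 0.257, "output": 1.286, "cache_read": 0.0257, "cache_write": 0.322}


def blended_input_price(cache_hit_rate, price=PRICE):
    """Average price per million input tokens.

    Assumes hits are billed as cache reads and every miss as a cache write.
    """
    hits = cache_hit_rate * price["cache_read"]
    misses = (1 - cache_hit_rate) * price["cache_write"]
    return hits + misses


blended = blended_input_price(0.97)
ratio = PRICE["input"] / blended
print(f"blended input price: ${blended:.4f} per million tokens")
print(f"plain uncached input would cost {ratio:.1f}x as much")
```

Under those assumptions, input at a 97% cache rate costs a small fraction of the uncached input price, which is why an estimate that ignores caching overshoots so badly.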
Companies may claim to be cash-flow positive, operating-profit positive, or EBITDA positive, but that is not a true profit in aggregate, only a profit after stripping out many other expenses.
If anyone has evidence to the contrary, please share. Once they go public it will all be free to review, at least.
Free: https://www.cnbc.com/2026/05/20/anthropic-revenue-explosive-...
Paywall: https://www.wsj.com/tech/ai/mind-blowing-growth-is-about-to-...
Actual article:
> Mind-Blowing Growth Is About to Propel Anthropic Into Its First Profitable Quarter
Condition:
> The startup expects a 130% revenue surge to $10.9 billion in the June quarter and its first operating profit, defying skeptics of the AI boom
Ah yes, if revenue grows by 130% and expenses don't they might make a profit of $500M on $11B revenue.
I wish people actually bothered to at least read the titles.
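For what it's worth, the arithmetic behind that scenario can be sketched as below. The $10.9B and 130% come from the quoted subtitle; the $0.5B operating profit is the guess from the comment above, not a reported figure, and holding expenses flat is the hypothetical being mocked, not a claim about Anthropic's actual costs.

```python
def implied_comparison_quarter(revenue_b, growth, operating_profit_b):
    """Back out the comparison quarter from a growth rate, holding expenses flat.

    All amounts are in billions of USD.
    """
    comparison_revenue = revenue_b / (1 + growth)     # revenue before the surge
    expenses = revenue_b - operating_profit_b         # expenses implied this quarter
    comparison_result = comparison_revenue - expenses # result if expenses had not grown
    margin = operating_profit_b / revenue_b           # operating margin this quarter
    return comparison_revenue, comparison_result, margin


rev, result, margin = implied_comparison_quarter(10.9, 1.30, 0.5)
print(f"implied comparison-quarter revenue: ${rev:.2f}B")
print(f"comparison-quarter operating result at flat expenses: ${result:.2f}B")
print(f"operating margin this quarter: {margin:.1%}")
```

So even in the generous flat-expenses case, the profit would be a margin of under 5% on the quarter.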
There are lots of knobs to dial for your costs.
I’d say we are at the peak of inflated expectations, exactly where they want things to be for the IPOs as they unload their bags on the public. There will be an absolutely massive crash that will destroy the stock market, you’ll finally be able to buy RAM and a hard drive again, and then AI will really take your job in the plateau of productivity that comes much later.
The true cost of AI won't be revealed until after a large portion of the customer base has become "hooked" on it.
I’m happily paying $200/mo for Claude Code. The tokens I use would be >$10k at API rates. I’m building the best products of my life, and multiple of them in parallel. I’m doing better creative work, finally realizing a game idea I’ve had no time for, etc.
If this level of usage goes to $500/mo… I’ll be out. It’s worth what I’m paying, but hey I went a decade without writing that song or building that game. It’s not freaking heroin, it’s a tool that offers a good value for what I pay.
Could be wrong though.
That, combined with the relatively low switching costs, doesn't bode well for AI companies seeking to create a walled garden.
* Hardware improvements will reduce costs
* Model training improvements (read: more efficient model training) will reduce costs
* Better models will reduce costs (more inference for less hardware time while keeping quality constant)
* Tooling and platform will stabilize—less need to dump money into applications and backend systems because they will become mature—also improvements in AI efficiency and quality will lower the cost of maintenance and future feature development
* Energy buildout will stabilize (we will eventually have enough energy supply to meet AI demand)
* Chips market will stabilize (chip supply will catch up to AI demand, lowering the hardware costs)
1. The GitHub Copilot pricing change that started a week ago has already changed buying decisions.
2. Small-name open-weight providers are selling at what I assume, and hear through the grapevine, is a profitable price.
Claude is overpriced for what you get and, if the headline is true, expensive to run. I do wonder if their API pricing is profitable. That's the word on the street about Big AI: they are making money on the PayGo.
Downside: you need to buy a new one for each model.
Upside: insanely fast inference and zero subscription cost, only one time purchase cost.
Once a certain open source model gets good enough this might become viable.
Right now the landscape is still shifting too fast.
State of the art models might remain on subscription, expensive and might be used by large companies only.
State of the art companies might also create their own hardware with hard-baked weights on chip that they don't release to the public, as it might just make more financial sense long term once they "stabilize" on a certain model.
Having lightning-speed, local inference of a super high-quality model would be incredible. If you haven't played with it, check out Taalas's demo [1].
Honestly, though - I have my doubts. Recurring revenue is just too nice to pass up; I'm sure AI companies wouldn't want me buying a dedicated Opus card and not giving them money for several years until there's something worth upgrading to.
Of course expecting the metaphorical Harvard Business School analysts to realize that is asking a lot. Subscriptions are Good and Goodness is Subscriptions, and like any other mass of people following trends the preconditions on when subscriptions are good for a business tend to get lost in the frenzy.
That would fuse 3D graphics and AI accelerators into 1 and the same unit, as far as consumer hardware is concerned.
the world has long since moved on from that business model, unfortunately.
Everything we’re using now is the equivalent of building a GPU on an FPGA: the hardware is general purpose at one abstraction level, and that comes with inefficiency at the next layer up. Collapse the levels, gain efficiency at the cost of generality.
To answer my own question, I bet they could figure out a way to still bill you per-token, if they wanted to.
And of course they could bill per-token, same way cable PPV worked (the bits were already in your house). But the cost structure of weights in silicon means that competitors would be encouraged to compete on this per-token cost, as their marginal cost would be zero.
I don’t see that being a durable business model, but I guess the counter argument is it’s also similar to game consoles, where initial hardware is subsidized and the business model assumes ongoing payment for bits.
I find it disingenuous when people narrow in and focus on the cost of tokens as if that's the only way the companies make money.
They are doing a massive data grab and stealing and thieving your IP and data, non-training data sharing cannot be opted out of.
Meeting next week at the White House, by coincidence just before the SpaceX IPO. Message to investors will be: don't worry, the US has your back...
At which point the corruption is sooo big, that an Empire crumbles under its own stench?
"Trump to meet AI leaders to discuss US investment in their companies" - https://www.bbc.com/news/articles/c98r8r7dz5no
"Trump Officials Held Millions of Dollars of SpaceX Ahead of IPO" - https://finance.yahoo.com/markets/stocks/articles/trump-offi...
I just hope local LLMs keep getting better and ways to make them run faster on consumer devices keep improving.