Why AI Tokens are so Expensive [video](www.youtube.com)

3 points by jonbaer 5 hours ago | 2 comments

mikewarot 3 hours ago
Well, this is certainly an interesting point at which we all find ourselves. I'm personally going to have to learn what a KV cache is, how big one gets, and what it costs to store it persistently instead of merely caching it. The cost of the loop ends up quadratic because everything is reprocessed each time a new query is added. In theory, this means that after a year there could be a billion input tokens to process just because you say "thanks!" to the model at the end of a conversation, which is nuts. Explicitly storing the model state should make it a linear cost again.
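To make the quadratic vs. linear point concrete, here's a rough sketch. The turn sizes are made-up numbers, and `tokens_processed` is just an illustrative helper, not anyone's real billing model. It only counts input tokens fed through the model, and ignores that attention over a longer context still costs more per new token even with a cache.

```python
def tokens_processed(turn_lengths, cached):
    """Total input tokens the model must process over a conversation.

    turn_lengths: new tokens added at each turn (hypothetical values).
    cached: if True, assume the state for earlier tokens is stored and
            reused, so only the new tokens are processed each turn.
    """
    total = 0
    context = 0
    for new_tokens in turn_lengths:
        context += new_tokens
        # Without a cache, the whole context is reprocessed every turn.
        total += new_tokens if cached else context
    return total


turns = [1000] * 100  # 100 turns of 1,000 tokens each (assumed numbers)
no_cache = tokens_processed(turns, cached=False)
with_cache = tokens_processed(turns, cached=True)
print(no_cache, with_cache)  # 5050000 100000
```

Doubling the number of turns roughly quadruples the uncached total but only doubles the cached one, which is the whole argument in two lines of arithmetic.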
I've been following Ed Zitron's reporting on AI and costs/profits. When Mike Pound starts talking about the costs at 4:12, he hints at the fact that none of us know the actual cost of handling a token. Ed's reporting hints that it may be quite a bit higher than we've been led to suspect. It may, in fact, be far above current retail prices, subsidized as part of the "marketing" expenses on the AI companies' balance sheets to try to gain market share.
It seems we're in for a "come to Jesus" moment, and a big pop in the markets as a result. This video is part of the structure that does the popping.
I think that actually storing the model state after each query/result will become the standard, to save reprocessing tokens. Switching between models would thus be discouraged, because it would lose that state. It wouldn't be impossible, it just wouldn't be cheap. I could see storing the entire AI state, including "thinking", model version, etc., along with the code in the GitHub repository, right next to the commit message. Git's delta compression should make storing all that binary data tractable.
We're living in interesting times. Agentic coding certainly seems valuable, but it may turn out to be ruinously expensive, compared to just using us human programmers. We'll have to wait and see.
smartformulapro 4 hours ago
[dead]