If it’s true, and the consensus is that we’re hitting the limits of how to improve these models, then the hypothesis that the entire market is in a bubble over-indexed on GPU costs [0] starts to look more credible.
At the very least, OpenAI and Anthropic look ridiculously inefficient. Mind you, given that the numbers on the Oracle deal don’t add up, this is all starting to sound insane already.
DeepSeek spent a large amount upfront to build a cluster they could run lots of small experiments on over the course of several years. If you count only the successful experiments, their costs look much lower than they actually were end-to-end.