These things are so tricky because everyone has a seemingly conflicting experience. Part of the fun I guess!
It's very easy to burn through your quota if you work like that. Especially on high / xhigh.
I use Codex when Claude Code is down, and I only began using Claude when ChatGPT was down
Yes, Codex is very fast, but I'm going back to Claude for now.
Opus 4.7 + Rust is a killer combo.
Heck I prefer DeepSeek to both of those.
I was running DeepSeek through Claude Code's agent harness. Maybe it works better through a different tool?
Harness matters, and so does the provider. I was using OpenRouter and switched to the DeepSeek API, and suddenly all the tool call issues I was having resolved themselves. Flash is so damn fast at stuff like generating boilerplate that I can't go back to the bigger, slower models.
That and the lack of image-read support surprised me. I'm a big fan of feeding screenshots into my LLM, and that killed it for me.
I would have been much more impressed with v4 about 6 months ago. But I've been spoiled by opus 4.7. Deepseek isn't at the same level.
My systems are hitting exponentially backed-off retries, so this might not get better: once everyone's retries fire, they overload things all over again.
> {'type': 'error', 'error': {'details': None, 'type': 'overloaded_error', 'message': 'Overloaded'}, 'request_id': 'req_ ...
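For anyone hitting the same thing, a minimal retry sketch with exponential backoff and full jitter (the jitter spreads retries out so clients don't all hammer the API at once). The `request` callable and response shapes here are placeholders, not any official SDK:

```python
import random
import time

def call_with_backoff(request, max_retries=5, base=1.0, cap=60.0):
    """Retry `request` on overloaded_error, sleeping a random
    (jittered) amount that grows exponentially with each attempt."""
    for attempt in range(max_retries):
        response = request()
        if response.get("type") != "error":
            return response
        if response["error"]["type"] != "overloaded_error":
            raise RuntimeError(response["error"]["message"])
        # Full jitter: pick a delay uniformly in [0, min(cap, base * 2^attempt)]
        # so retries from many clients don't synchronize.
        delay = random.uniform(0, min(cap, base * 2 ** attempt))
        time.sleep(delay)
    raise RuntimeError("still overloaded after %d retries" % max_retries)
```

Without the jitter, every client that failed at the same moment retries at the same moment, which is exactly the retry-storm problem above.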
I can see a weird spike in my cache hit-rate a few minutes before, so this might actually be some extra caching they have thrown in.
Dario and co seem to be on some elevated pedestal - we mere mortals are beneath them - and they have this scattershot devrel where each engineer has their own way of communicating on X, often at odds with the others.
I loved Sonnet and Opus fwiw but not anymore.