10 points by pseudolus 6 hours ago | 2 comments

tharkun__ 5 hours ago
Meh, they work, sort of, sometimes.
Don't get me wrong. I've been using Claude Code and Codex CLI for quite some time now and it's amazing what they can sometimes do. (I skipped the "Copilot" phase where AI was just a "better" auto-complete.)
Emphasis on sometimes. And you really have to double check everything they do. So much.
And literally this week, Claude turned "dumb". Things I'd expected it to handle before now result in stuff I really just throw out the window. I thought maybe I'd started prompting differently or something, so I tried multiple times on the same task. But no, it just went nowhere this week. Codex worked fine on the same problem, but it tried to cheat real bad on the test cases. Luckily I caught it, but otherwise the tests would've been completely useless. Essentially "always green".
And this is on the "pay-per-token" work account, so I can't simply explain it away with "they're saving on compute for free / bulk pay".
plagiarist 5 hours ago
I oscillate between worry and an overwhelming sense of job security.
- whattheheckheck an hour ago
  Right, like, oh, now the subject matter experts can code... Good luck debugging the mess you made.