I don't understand how this is even possible at Anthropic. Couldn't they, like, embed an agentic swarm into their backend that prevents any errors from ever making it into production? What am I missing?
I've spent the last couple days building out an automated classifier on top of the batch API, and just this morning (about a minute before the outage began!) started running my first live tests. I thought I was going mad!
Well, if you look at the status page [0], it seems to have become a daily occurrence only in the past two weeks. Even before that, uptime wasn't so good, though.