2 points by akarshc 8 hours ago | 1 comment
  • akarshc 8 hours ago
    I’ve worked on and shipped a few AI systems that reached real users.

    This post isn’t about models or prompts. It’s about the things that kept breaking once AI moved off the happy path: async jobs, retries, silent failures, provider outages, cost blowups, and debugging without visibility.

    I wrote this mostly as a way to document the mistakes I made and what I wish I had known earlier. Happy to answer questions or dig deeper into any of the failure modes.