4 points by akarshc 2 hours ago | 5 comments
  • akarshc 2 hours ago
    While building AI features that rely on real-time streaming responses, I kept running into failures that were hard to reason about once things went async.

    Requests would partially stream, providers would throttle or fail mid-stream, and retry logic ended up scattered across background jobs, webhooks, and request handlers.

    I built ModelRiver as a thin API layer that sits between an app and AI providers and centralizes streaming, retries, failover, and request-level debugging in one place.
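
    As a rough sketch, the app-side call collapses to a single streaming request to the gateway, with retries and failover happening behind it. The endpoint, payload fields, and provider names below are hypothetical placeholders, not ModelRiver's actual API:

      import requests  # assuming an HTTP/SSE-style gateway

      def stream_completion(prompt):
          # One request to the gateway; provider failover and retry
          # policy live behind this endpoint, not in app code.
          with requests.post(
              "https://gateway.example.com/v1/stream",  # hypothetical URL
              json={"prompt": prompt,
                    "fallbacks": ["provider-a", "provider-b"]},  # hypothetical field
              stream=True,
              timeout=30,
          ) as resp:
              resp.raise_for_status()
              for chunk in resp.iter_lines(decode_unicode=True):
                  if chunk:
                      yield chunk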

    It’s early and opinionated, and there are tradeoffs. Happy to answer technical questions or hear how others are handling streaming reliability in production AI apps.

  • amalv 2 hours ago
    At what point does adding this layer become more complex than just handling streaming failures directly in the app?
    • akarshc 2 hours ago
      If streaming behavior is still product-specific and changing fast, this adds friction. It only pays off once failure handling stabilizes and starts repeating across the system.
  • arxgo 2 hours ago
    Why not just handle this in the application with queues and background jobs?
    • akarshc 2 hours ago
      Queues work well before or after a request, but they’re awkward once a response is already streaming. This layer exists mainly to handle failures during a stream without spreading that logic across handlers, workers, and client code.
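
      A minimal sketch of what that mid-stream handling looks like if you do keep it in the app, assuming each provider exposes a callable that can resume generation from an already-streamed prefix (a hypothetical interface):

        import time

        def stream_with_failover(prompt, providers, backoff=0.5):
            """Yield chunks, failing over to the next provider if a
            stream dies partway through."""
            received = []  # chunks already delivered to the caller
            for attempt, provider in enumerate(providers):
                try:
                    # Ask the fallback to continue from what the caller
                    # already has instead of restarting the response.
                    for chunk in provider(prompt, prefix="".join(received)):
                        received.append(chunk)
                        yield chunk
                    return  # stream completed cleanly
                except (TimeoutError, ConnectionError):
                    # Mid-stream failure: back off, then try the next provider.
                    time.sleep(backoff * (attempt + 1))
            raise RuntimeError("all providers failed mid-stream")

      The awkward part is exactly the "received" bookkeeping: once chunks have reached the client, a clean retry is no longer possible, which is what pushes this logic out of individual handlers and into one layer.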