20 points by fryz 16 hours ago | 11 comments
  • serguei 16 hours ago
    We've been ramping up our gen AI usage for the last ~month at Upsolve, and it's becoming a huge pain. There are already a million solutions for observability out there, but I like that this one is open source and can detect hallucinations.

    Thanks for open sourcing and sharing, excited to try this out!!

    • fryz 11 hours ago
      Yeah, thanks for the feedback.

      We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, and with a product that can be deployed easily and reliably in customer-managed environments.

      A lot of the products today have similar evaluations and metrics, but they either offer a SaaS solution or require some onerous integration into your application stack.

      Because we started with the enterprise first, our goal was to get to value as quickly and as easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well with our product.

  • Gabriel_h 16 hours ago
    Interesting, AI needs much better guardrails and monitoring!
  • kacperek0 16 hours ago
    Cool, I'm running a few GenAI automations, but they're rather unsupervised. So I'm gonna try it and check how they're doing.
  • madeleinelane 13 hours ago
    Love this. More transparency + better tooling is exactly what AI needs right now. Excited to give it a try.
  • iabouhashish 15 hours ago
    Very excited to be trying this out! The examples look very useful, and I'm excited to tie it in with other open source solutions.
  • Lupita___ 16 hours ago
    Thanks for sharing! This looks perfect for teams getting started with monitoring for all model types -- excited to try it out!
  • pierniki 16 hours ago
    Yoo! Hopefully no more "oops our AI just leaked the system prompt" moments thanks to these guardrails!
  • jdbtech 14 hours ago
    Looks great! How does the system detect hallucinations?
    • fryz 11 hours ago
      Yeah, great question.

      We based our hallucination detection on "groundedness," evaluated on a claim-by-claim basis: whether the LLM response can be cited in the provided context (e.g., message history, tool calls, retrieved context from a vector DB, etc.).

      We split the response into multiple claims, determine whether each claim needs to be evaluated (e.g., that it isn't just boilerplate), and then check whether the claim is referenced in the context.
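
      A rough sketch of what that claim-level flow could look like (illustrative Python only; the helper callables split_claims, is_checkable, and is_supported are hypothetical stand-ins, not the project's actual API):

        # Hypothetical sketch of claim-level groundedness checking, not the
        # project's actual implementation: split a response into claims,
        # skip boilerplate, and judge whether each claim is supported by
        # the provided context (message history, tool calls, retrieved chunks).
        from dataclasses import dataclass
        from typing import Callable, List, Optional

        @dataclass
        class ClaimVerdict:
            claim: str
            evaluated: bool           # False for boilerplate like "Sure, happy to help!"
            grounded: Optional[bool]  # None when the claim was skipped

        def check_groundedness(
            response: str,
            context: List[str],                              # history, tool calls, retrieved chunks
            split_claims: Callable[[str], List[str]],        # e.g. sentence splitter or LLM
            is_checkable: Callable[[str], bool],             # filters out boilerplate claims
            is_supported: Callable[[str, List[str]], bool],  # judge: is the claim cited in context?
        ) -> List[ClaimVerdict]:
            verdicts = []
            for claim in split_claims(response):
                if not is_checkable(claim):
                    verdicts.append(ClaimVerdict(claim, evaluated=False, grounded=None))
                    continue
                verdicts.append(ClaimVerdict(claim, evaluated=True,
                                             grounded=is_supported(claim, context)))
            return verdicts

        def has_hallucination(verdicts: List[ClaimVerdict]) -> bool:
            # Flag the response if any evaluated claim is not grounded in the context.
            return any(v.evaluated and not v.grounded for v in verdicts)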

  • vparekh1995 16 hours ago
    Excited to get hands on with this. I've had too many sleepless nights trying to figure out how to track when my agents were hallucinating.
  • cipherchain111 16 hours ago
    Very cool!