5 points by kavin_key 4 hours ago | 1 comment
    Hey HN, Kavin here, co-founder of Trainly. We built observability for AI agents, and the hardest part of selling it has been getting people to believe they have a problem. "My agent works fine" is the universal answer, right up until you actually look at the traces. So we're giving the diagnostic part away. Drop in our SDK with a one-line @observe decorator (or ask Claude Code to add it), let it run for 72 hours, and we'll send back a report on what we found: silent tool failures, retry loops, latency and cost outliers, error patterns by input shape, and any obvious behavioral weirdness in the data.
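The SDK itself isn't shown in the post; as a rough sketch of what a one-line @observe integration could look like (the buffer, cap handling, and field names here are hypothetical, not Trainly's actual code):

```python
import functools
import time

TRACE_BUFFER = []    # hypothetical in-memory stand-in for the real trace exporter
TRACE_CAP = 10_000   # per the post: tracing stops silently past the cap

def observe(fn):
    """Record function name, latency, and error for each call; never raise from tracing."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise
        finally:
            # Silent stop past the cap, so tracing never breaks the app
            if len(TRACE_BUFFER) < TRACE_CAP:
                TRACE_BUFFER.append({
                    "fn": fn.__name__,
                    "latency_ms": (time.perf_counter() - start) * 1000,
                    "error": error,
                })
    return wrapper

@observe
def call_tool(query):
    return f"result for {query}"
```

The `finally` block is the important design choice: the trace is recorded whether the wrapped call returns or raises, and the cap check keeps the instrumented app working even after collection stops.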

    A few technical notes:

    The audit itself is 100% heuristic: no LLM calls on our end, just queries over the traces you send us, so your prompts aren't ending up in anyone's context window by accident. The trace cap is 10k per audit; past that, the API key auto-disables at the auth layer and the SDK silently stops tracing, so it won't break your app.
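To make "heuristic queries over traces" concrete, here is what two of the listed checks (retry loops and latency outliers) might look like over a hypothetical trace shape — field names and thresholds are illustrative, not Trainly's actual pipeline:

```python
from statistics import mean, stdev

def find_retry_loops(traces, min_repeats=3):
    """Flag runs where the same tool is called min_repeats+ times consecutively."""
    loops, run = [], []
    for t in traces:
        if run and t["tool"] == run[-1]["tool"]:
            run.append(t)
        else:
            if len(run) >= min_repeats:
                loops.append((run[0]["tool"], len(run)))
            run = [t]
    if len(run) >= min_repeats:  # flush the final run
        loops.append((run[0]["tool"], len(run)))
    return loops

def find_latency_outliers(traces, z=3.0):
    """Flag traces more than z standard deviations above mean latency."""
    lat = [t["latency_ms"] for t in traces]
    if len(lat) < 2:
        return []
    mu, sigma = mean(lat), stdev(lat)
    return [t for t in traces if sigma and (t["latency_ms"] - mu) / sigma > z]
```

Checks like these are cheap enough to run as plain queries over a trace table, which is what makes the "no LLM calls" claim plausible for this tier.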

    The paid product layers unsupervised semantic anomaly detection on top: HDBSCAN over joint embeddings plus behavioral features, UMAP for visualization, and LLM-generated cluster summaries. The audit is the heuristic pass; it'll surface the obvious stuff, not the subtle drift.
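The post doesn't show the implementation, but the clustering idea can be sketched on synthetic data. This toy version uses scikit-learn's DBSCAN as a widely available stand-in for HDBSCAN (HDBSCAN proper lives in the `hdbscan` package or scikit-learn >= 1.3), skips UMAP, and fakes the "joint embeddings + behavioral features" as random vectors:

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins: per-trace vectors combining "semantic" embedding dims
# with behavioral features (e.g. latency, tool-call count), concatenated.
normal = rng.normal(0.0, 0.1, size=(200, 4))  # the bulk of well-behaved traces
weird = rng.normal(3.0, 0.1, size=(5, 4))     # a small anomalous group
X = StandardScaler().fit_transform(np.vstack([normal, weird]))

# Density-based clustering: points labeled -1 are noise, i.e. anomaly candidates.
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(X)
anomalous = np.flatnonzero(labels == -1)
```

The appeal of density-based methods here is that you don't pre-specify the number of clusters, and low-density points fall out as labeled noise for free, which is exactly the "behavioral weirdness" bucket.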

    First 50 audits are free. Email + SDK install, no call required.

    Context: we're pre-first-customer. Part of why I'm doing this is that I want real agent traces to stress-test the anomaly detection against, and part is that cold outreach is slow and I'd rather have 20 of you stress-test the product than spend another month in LinkedIn DMs. If the report is useless, tell me why. Link: https://trainlyai.com/audit. Happy to answer anything: infra, pricing, why not just use Braintrust, whatever.