1 point by digitalshepard | 5 hours ago | 1 comment
  • digitalshepard | 5 hours ago
    Hey HN. I built this because I had a $47 Tuesday.

    One Claude Code session, eight hours, no visibility into what was happening. By the time I checked the billing page the next morning, the damage was done.

    The problem: AI coding CLIs (Claude Code, Codex, Gemini CLI) don't ship with observability. You get a chat interface and a monthly bill. No metrics, no traces, no alerts.

    The solution: shell hooks (bash + curl + jq) that emit native OTLP to a standard OTel Collector → Prometheus + Loki + Tempo → Grafana. Six containers, docker-compose up, eight dashboards auto-provisioned.
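
    A sketch of how that six-container stack might be wired together in docker-compose. Image names and ports are assumptions, not the project's actual compose file, and the sixth container is a guess (Alertmanager, given the alert rules):

    ```yaml
    # Hypothetical compose sketch of the stack described above.
    services:
      otel-collector:
        image: otel/opentelemetry-collector-contrib
        ports: ["4318:4318"]   # OTLP/HTTP, where the hooks POST
      prometheus:
        image: prom/prometheus
      loki:
        image: grafana/loki
      tempo:
        image: grafana/tempo
      alertmanager:            # a guess at the sixth container
        image: prom/alertmanager
      grafana:
        image: grafana/grafana
        ports: ["3000:3000"]   # dashboards auto-provisioned here
    ```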

    What it tracks:

    - Cost in real time (USD, not just tokens)
    - Tool calls per agent (which tools, how often, error rates)
    - Session timelines reconstructed as Tempo traces (the Claude parser is 265 lines of pure jq)
    - 15 alert rules: HighSessionCost fires at $10/hr, SensitiveFileAccess fires when an agent touches your .env
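
    In Prometheus terms, the HighSessionCost rule plausibly looks something like this. The metric name and expression are assumptions, not the project's actual rule:

    ```yaml
    # Hypothetical alerting rule; agent_session_cost_usd_total is an
    # assumed counter of cumulative session spend.
    groups:
      - name: agent-cost
        rules:
          - alert: HighSessionCost
            # spend per second over 5m, scaled to an hourly burn rate
            expr: rate(agent_session_cost_usd_total[5m]) * 3600 > 10
            for: 2m
            labels:
              severity: warning
            annotations:
              summary: "Agent session burning more than $10/hr"
    ```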

    The hooks are ~30 lines of bash each. No SDK, no Python wrapper, no runtime dependency beyond curl and jq. Each hook receives JSON from the CLI, builds an OTLP payload, and fires it at localhost:4318. The & disown at the end is the entire performance strategy.
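
    The hook shape described above can be sketched roughly like this. The event field names (tool_name), attribute keys, and endpoint path are assumptions, not the project's actual code:

    ```shell
    # Hypothetical hook sketch: turn one CLI hook event into an
    # OTLP/HTTP logs payload and fire-and-forget it at the collector.
    build_otlp_payload() {
      # stdin: one JSON event from the CLI; stdout: an OTLP logs payload
      jq -c --arg ts "$(date +%s%N)" '{
        resourceLogs: [{
          resource: {attributes: [{key: "service.name",
                                   value: {stringValue: "claude-code"}}]},
          scopeLogs: [{logRecords: [{
            timeUnixNano: $ts,
            body: {stringValue: ("tool_call " + (.tool_name // "unknown"))},
            attributes: [{key: "tool.name",
                          value: {stringValue: (.tool_name // "unknown")}}]
          }]}]
        }]
      }'
    }

    # Backgrounded and disowned so the hook never blocks the CLI.
    payload="$(echo '{"tool_name":"Bash"}' | build_otlp_payload)"
    { command -v curl >/dev/null && curl -s -o /dev/null -X POST \
        -H 'Content-Type: application/json' \
        --data "$payload" http://localhost:4318/v1/logs; } & disown 2>/dev/null || true
    ```

    Note that OTLP's JSON encoding takes timeUnixNano as a string, which is why the jq --arg (string) binding works here.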

    Codex doesn't emit native metrics, only logs — so there are 15 Loki recording rules that extract structured fields and remote-write to Prometheus. The Codex dashboard queries PromQL as if metrics were native.
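
    One of those Loki recording rules plausibly looks like this, with the ruler's remote_write pushing the resulting series to Prometheus. The stream selector, label names, and metric name are all assumptions:

    ```yaml
    # Hypothetical Loki ruler rule file: a LogQL metric query recorded
    # as a Prometheus-style series.
    groups:
      - name: codex
        interval: 1m
        rules:
          - record: codex:tool_calls:rate5m
            expr: |
              sum by (tool) (
                rate({service="codex"} | json | tool != "" [5m])
              )
    ```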

    Elastic-2.0 licensed (free to use, can't offer as hosted service).

    Longer write-up with architecture details: https://digitalshepard.ai/articles/the-eye-part2/