That's what I'm building: treat agent-to-LLM calls like microservice calls. Detect loops, enforce budgets, fail-open if the firewall itself crashes. Same patterns, new domain.
The interesting part is loop detection needs to be semantic, not just frequency-based. Two agents can politely argue in circles without ever triggering a rate limit.