I built this after dealing with agent reliability issues in production. LLM APIs fail in weird ways, tools return garbage, streams cut off mid-response. Traditional chaos tools don't understand these failure modes.
`agent-chaos` lets you define conversation scenarios, inject chaos (rate limits, tool errors, corrupt data), and assert on the agent's behavior. It integrates with DeepEval for LLM-as-judge assertions.
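To make the "inject chaos, assert behavior" idea concrete, here's a minimal, library-free Python sketch of the pattern. None of these names (`flaky`, `RateLimitError`, `agent_answer`) are agent-chaos's actual API; they just illustrate wrapping a tool so some calls fail like a rate-limited endpoint, then asserting the agent still ends up in an acceptable state.

```python
import random

class RateLimitError(Exception):
    """Stand-in for a 429-style failure from an upstream API."""

def flaky(tool, failure_rate=0.3, seed=42):
    """Wrap a tool so a fraction of calls raise RateLimitError (chaos injection)."""
    rng = random.Random(seed)
    def wrapped(*args, **kwargs):
        if rng.random() < failure_rate:
            raise RateLimitError("429: too many requests")
        return tool(*args, **kwargs)
    return wrapped

def search_tool(query):
    # Toy tool; a real agent would call an external API here.
    return f"results for {query}"

def agent_answer(query, tool, max_retries=3):
    """Toy agent loop: retry the tool a few times, then fall back gracefully."""
    for _ in range(max_retries):
        try:
            return tool(query)
        except RateLimitError:
            continue
    return "I couldn't complete the search."

chaotic_search = flaky(search_tool, failure_rate=0.5)
answer = agent_answer("chaos testing", chaotic_search)
# Behavior assertion: the agent either succeeds or degrades gracefully,
# never crashes or returns nothing.
assert "results" in answer or "couldn't" in answer
print(answer)
```

The actual library handles the conversation-level version of this (scenarios, streamed responses, corrupt tool output) and hands the final transcript to DeepEval-style judges instead of plain asserts.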
Feedback welcome.