NadirClaw is a Python proxy that classifies each prompt in ~10ms and routes simple ones to Gemini Flash, Ollama, or whatever cheap model you want. Only complex prompts hit your premium API. It's OpenAI-compatible, so you just point your existing tools at it.
In practice I went from burning through my Claude quota in 2 days to having it last the full week. Costs dropped around 60%.
pip install nadirclaw
Still early. The classifier is simple (token count + pattern matching + optional embeddings). Curious what breaks first.