2 points by christalingx 7 hours ago | 1 comment
  • christalingx 7 hours ago
    How it works under the hood (since HN will ask): no LLM calls, no summarization; it's purely deterministic.

    Strips filler words ("basically", "essentially"), collapses verbose constructions ("in order to" → "to"), and removes redundant connectors. Output is always a strict subsequence of the original: no words added, none reordered.
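    A minimal sketch of that kind of rule engine, using hypothetical rule tables (the actual agentready rules aren't public). Note that both rule types only ever delete words, which is what keeps the output a subsequence of the input:

```python
import re

# Hypothetical rule tables for illustration only -- not the real rule set.
# Every rule here deletes words; none inserts or reorders them.
FILLERS = {"basically", "essentially", "actually", "literally"}
COLLAPSES = [
    # "in order to" -> "to": drops "in order", keeps "to" in place.
    (re.compile(r"\bin order to\b", re.IGNORECASE), "to"),
    # "whether or not" -> "whether": drops the redundant tail.
    (re.compile(r"\bwhether or not\b", re.IGNORECASE), "whether"),
]

def compress(text: str) -> str:
    # Deterministic, no LLM: apply phrase collapses, then drop fillers.
    for pattern, replacement in COLLAPSES:
        text = pattern.sub(replacement, text)
    kept = [w for w in text.split() if w.lower().strip(",.;:") not in FILLERS]
    return " ".join(kept)
```

    The interesting part (and presumably where the edge cases live) is rule ordering: collapses must run before filler removal, or a deleted filler can break a multi-word phrase match.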

    On privacy, since it always comes up: your OpenAI/Claude keys never leave your app. You send us text → we return compressed text → you call your LLM yourself. We don't know which model you use or what you're building.

    Real numbers across 2.4M+ calls: 42% average reduction. One beta user at 50k prompts/day saves $2,100/month.

    For existing codebases, one line:

        from agentready import patch_openai

    Every OpenAI call gets compressed automatically. Zero other changes.

    Free during beta, no card: https://agentready.cloud/hn

    AMA on how the rule engine works; the edge cases are more interesting than you'd expect.