Quantifying LLM Cost Savings from Cache-Aware Inference Routing
(www.auriko.ai)
4 points by zxy-action 3 hours ago | 1 comment
zxy-action 3 hours ago
I’m the founder of Auriko. We ran this study to measure how much cache-aware LLM routing can reduce inference costs.
Critique welcome.