2 points by wonderwhyer 7 hours ago | 1 comment
  • wonderwhyer 7 hours ago
    The debate around local vs API vs subscriptions feels mostly anecdotal. I tried building a tool that compares them using “quality-adjusted tokens per dollar.”

    The idea:

    Tokens per dollar

    Weighted input/output pricing (75/25 assumption)

    Benchmark-normalized quality (Arena, Aider, SWE-bench)
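
    To make the three pieces above concrete, here is a minimal sketch of how they could combine. The names (Offering, blended_price, quality_adjusted_tpd) are hypothetical, the 75/25 split is assumed to mean 75% input and 25% output tokens, quality is assumed to be already scaled to 0..1, and the prices in the usage lines are placeholders, not real quotes.

```python
from dataclasses import dataclass


@dataclass
class Offering:
    """One way to buy tokens (hypothetical structure for illustration)."""
    name: str
    input_usd_per_mtok: float   # USD per 1M input tokens
    output_usd_per_mtok: float  # USD per 1M output tokens
    quality: float              # benchmark score, assumed pre-normalized to 0..1


def blended_price(o: Offering, input_share: float = 0.75) -> float:
    """Weighted USD per 1M tokens; assumes 75% input / 25% output by default."""
    return (input_share * o.input_usd_per_mtok
            + (1.0 - input_share) * o.output_usd_per_mtok)


def tokens_per_dollar(o: Offering, input_share: float = 0.75) -> float:
    """Raw tokens bought per USD at the blended price."""
    return 1_000_000 / blended_price(o, input_share)


def quality_adjusted_tpd(o: Offering, input_share: float = 0.75) -> float:
    """Tokens per dollar scaled linearly by the normalized quality score."""
    return o.quality * tokens_per_dollar(o, input_share)


# Placeholder numbers purely for illustration, not real prices or scores.
example = Offering("example-model", input_usd_per_mtok=2.0,
                   output_usd_per_mtok=10.0, quality=0.8)
print(blended_price(example), quality_adjusted_tpd(example))
```

    Note that multiplying by quality treats a 0.4 model at half the price as equal to a 0.8 model, which is itself a modeling choice open to critique.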

    Early results surprised me: local often loses on cost unless privacy is weighted heavily.

    I’m mostly looking for critique of the methodology:

    Is quality-adjusted tokens per dollar even the right metric?

    Is normalizing Elo ratings to a percentage defensible?

    What benchmarks am I missing?
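
    For the Elo question, it may help to compare two candidate normalizations side by side. This is only a sketch, not necessarily what the tool does: min-max scaling over the compared pool, versus the standard Elo expected-score formula against a chosen reference rating (scale factor 400). The function names and the sample ratings are hypothetical.

```python
def minmax_normalize(ratings: dict[str, float]) -> dict[str, float]:
    """Scale ratings linearly so the pool's worst is 0.0 and best is 1.0."""
    lo, hi = min(ratings.values()), max(ratings.values())
    if hi == lo:
        # Degenerate pool: every model has the same rating.
        return {name: 1.0 for name in ratings}
    return {name: (r - lo) / (hi - lo) for name, r in ratings.items()}


def elo_expected_score(rating: float, reference: float) -> float:
    """Standard Elo expected score of `rating` against `reference` (0..1)."""
    return 1.0 / (1.0 + 10 ** ((reference - rating) / 400.0))


# Made-up ratings for illustration only.
pool = {"model_a": 1000.0, "model_b": 1100.0, "model_c": 1200.0}
print(minmax_normalize(pool))
print({name: elo_expected_score(r, reference=1200.0) for name, r in pool.items()})
```

    Min-max depends entirely on which models are in the pool, so adding a weak model shifts every score. The expected-score version is pool-independent but needs a defensible reference rating.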