Hey HN – I built TokenShrink, an npm package that compresses AI prompts to save tokens. Zero dependencies, runs
locally in <1ms.
After posting v1.0, r/LocalLLaMA tore it apart. They were right.
The problem: v1.0 estimated tokens as `words × 1.3`. But BPE tokenizers don't work that way. "database" is 1 token.
"db" is also 1 token. That replacement saves exactly nothing. Worse — "should" → "shd" goes from 1 token to 2. We were
making prompts MORE expensive.
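If you want to reproduce that failure mode yourself, here's a minimal check using gpt-tokenizer (the same package the benchmarks below were verified with; I'm assuming its cl100k_base subpath export). It just prints before/after token counts for a couple of dictionary entries:

```ts
// Leading spaces matter: mid-prompt words tokenize as " database", not "database".
import { encode } from "gpt-tokenizer/encoding/cl100k_base";

const pairs: Array<[string, string]> = [
  [" database", " db"],  // both sides may already be a single token
  [" should", " shd"],   // the "shorter" form can cost more tokens
];

for (const [long, short] of pairs) {
  const before = encode(long).length;
  const after = encode(short).length;
  console.log(`"${long.trim()}" ${before} -> "${short.trim()}" ${after} (delta ${before - after})`);
}
```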
What v2.0 does differently:
- Precomputed every dictionary entry against cl100k_base (GPT-4's tokenizer)
- Removed 130 entries that saved zero tokens
- Removed 45 entries that actually increased token count
- Replaced the word heuristic with a real token cost lookup table (see the sketch after this list)
- Added pluggable tokenizer support: `compress(text, { tokenizer })`
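The precompute step works roughly like this (an illustrative sketch, not the package source; the dictionary entries and variable names here are made up):

```ts
// Keep a dictionary entry only if the replacement is strictly cheaper under
// cl100k_base; the measured delta becomes the lookup-table value.
import { encode } from "gpt-tokenizer/encoding/cl100k_base";

const dictionary: Record<string, string> = {
  " in order to": " to",
  " due to the fact that": " because",
  " should": " shd", // pruned below: the abbreviation is not cheaper
};

const costTable: Record<string, { replacement: string; saved: number }> = {};

for (const [phrase, replacement] of Object.entries(dictionary)) {
  const saved = encode(phrase).length - encode(replacement).length;
  if (saved > 0) {
    costTable[phrase] = { replacement, saved };
  }
  // Entries with saved <= 0 (the 130 zero-savings and 45 negative-savings
  // entries from v1.0) get dropped instead of shipped.
}

console.log(costTable);
```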
What it still does well — phrase compression. "In order to" → "to" saves 2 tokens. "Due to the fact that" → "because"
saves 4. "It is important to" → removed entirely. These multi-word filler phrases are where the real savings are.
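Putting it together, here's a usage sketch. It assumes the `compress(text, { tokenizer })` signature above, that the `tokenizer` option accepts an encode function, and that `compress` returns the compressed string:

```ts
import { compress } from "tokenshrink";
import { encode } from "gpt-tokenizer/encoding/cl100k_base";

const prompt =
  "In order to ship this, it is important to add retries, " +
  "due to the fact that the upstream API is flaky.";

// Assumption: passing encode() lets savings be measured against real
// cl100k_base counts instead of a word heuristic.
const shorter = compress(prompt, { tokenizer: encode });

console.log(encode(prompt).length, "->", encode(shorter).length, "tokens");
```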
Benchmarks (verified with gpt-tokenizer):
Verbose dev prompt: 408 → 349 tokens (14.5%)
Code review prompt: 210 → 183 tokens (12.9%)
Medical notes: 151 → 134 tokens (11.3%)
Business requirements: 143 → 121 tokens (15.4%)
Minimal filler: 77 → 77 tokens (0.0%)
No prompt had its token count increase. Zero false savings.
npm: npm install tokenshrink
Web: https://tokenshrink.com
GitHub: https://github.com/chatde/tokenshrink