4 points by george_ciobanu 5 hours ago | 1 comment
  • camelliaPTM 5 hours ago
    The prompt went from 44k to 6k tokens, but you're making two extra model calls per round to get there (chunker + working_memory_update). What does the all-in cost comparison actually look like?
    • george_ciobanu 5 hours ago
      The proxy uses a cheap, small model (gpt-5.4-mini by default) behind the scenes to save tokens on the expensive main model.

      Because the proxy adds a small amount of overhead on every turn, the break-even point depends entirely on session length.

      Short sessions (e.g., 2 rounds): The proxy's overhead might actually cost you more than you save.

      Long sessions (e.g., 69 to 190 rounds): The token savings on the main model are massive and completely dwarf the small model's overhead.

      It's not a universal win for quick, one-off queries, but the math becomes highly favorable on long, complex debugging sessions.
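      To make the break-even intuition above concrete, here is a toy cost model. Every number in it is an illustrative assumption (prices, the 2k-token/round history growth, the 8k-token overhead per cheap call), not a figure from the post; only the 44k-vs-6k prompt sizes and the two extra calls per round (chunker + working_memory_update) come from the thread.

```python
# Toy break-even model: accumulated-history baseline vs. compressed proxy
# prompt. All constants below are assumptions for illustration only.

MAIN_PRICE = 10.0 / 1_000_000   # assumed $/input token, expensive main model
CHEAP_PRICE = 0.15 / 1_000_000  # assumed $/input token, small proxy model

def baseline_cost(rounds, start=2_000, growth=2_000):
    """Without the proxy, the prompt accumulates history each round."""
    return sum((start + growth * r) * MAIN_PRICE for r in range(rounds))

def proxy_cost(rounds, compressed=6_000, overhead=8_000):
    """With the proxy, the prompt stays ~compressed, but each round makes
    two extra cheap-model calls (chunker + working_memory_update)."""
    per_round = compressed * MAIN_PRICE + 2 * overhead * CHEAP_PRICE
    return rounds * per_round

for n in (2, 10, 100):
    print(n, round(baseline_cost(n), 4), round(proxy_cost(n), 4))
```

      Under these assumptions, the proxy loses at 2 rounds (its flat per-round cost exceeds the still-small baseline prompt) and wins heavily by 100 rounds, matching the claim above; the exact crossover shifts with the real prices and growth rate.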