4 points by StanAngeloff 8 hours ago | 1 comment
  • StanAngeloff 8 hours ago
    > https://artificialanalysis.ai/agents/coding-agents?coding-ag...

    This is the full URL for the view that computes a composite average across DeepSWE, Terminal-Bench and SWE-Atlas-QnA. Each model is measured in its own harness.
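
    For anyone wondering what a composite like that could look like, here is a minimal sketch. It assumes an unweighted mean of the three benchmark scores; I don't know whether the site weights or normalizes them, and the function name and the numbers are made-up placeholders, not real leaderboard results.

```python
def composite_score(scores: dict[str, float]) -> float:
    """Unweighted mean of per-benchmark scores (an assumption, not the site's documented method)."""
    if not scores:
        raise ValueError("need at least one benchmark score")
    return sum(scores.values()) / len(scores)


# Placeholder values for illustration only; these are NOT real results.
example = {"DeepSWE": 60.0, "Terminal-Bench": 50.0, "SWE-Atlas-QnA": 70.0}
print(composite_score(example))
```

    A plain mean lets one strong benchmark offset a weak one, so two agents can tie on the composite while looking quite different on the individual benchmarks.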

    What surprises me is that Claude Code + Fable 5 (max) is on par with Codex + GPT-5.5 (xhigh), yet Fable burnt through 1M extra tokens.