1 point by grigio 3 hours ago | 1 comment

throwawayffffas 3 hours ago
> Note: "Benchmarks are less important than real-world tests for production adoption"
> Significantly better SWE-Bench (+56 pts), MCP tool use (2x), and agent workflows.
What? Make up your mind: do the benchmarks matter or not?