granzymes 3 hours ago
Beating Opus 4.6 and coming within striking distance of gpt-5.4 is impressive! Particularly given larger labs like Meta are struggling to catch up to OpenAI/Anthropic.
More competition among model vendors is great for developers!
ManuelSuarez 3 hours ago
Cursor is in a very tough situation right now. They don't have SOTA models (see the lack of benchmarks in the release), and they likely cannot subsidize usage through cheap subscriptions like Claude Code and OpenAI do.
I wonder what their plan is moving forward; they have been releasing a ton of random features lately.
- leerob 3 hours ago
  Are there other coding benchmarks we should include next time? We included Terminal-Bench 2.0 and SWE-bench Multilingual.
  We don't plan on reporting SWE-bench Verified, for similar reasons to OpenAI: https://openai.com/index/why-we-no-longer-evaluate-swe-bench...
- merlindru an hour ago
  ...you're looking at their plan
tomasz-tomczyk 3 hours ago
Just when I increased my subscription with CC for more Opus 4.6 usage :)