Hacker News
Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
2 points by gmays 8 hours ago | 1 comment
agentica_ai 8 hours ago [dead]