Hacker News
Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering
(arxiv.org)
1 point by popey 7 hours ago | 1 comment
pqtr2 4 hours ago
Couldn't agree more. Coding benchmarks are just a score. Benchmark the harness.