Hacker News
Why averaging LLM benchmark scores is fundamentally broken (arxiv.org)
1 point by testofschool 4 hours ago | 1 comment
testofschool 4 hours ago [flagged]