Hacker News
HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)
(hwebench.com)
5 points by fesens 4 hours ago | 2 comments
fesens 4 hours ago
Current benchmarks have ceilings, usually 100%. This benchmark aims to be long-lasting, to correlate highly with the ability to solve real-world problems and follow complex instructions, and to be unbounded (meaning scores can always go higher).
fabiofachini92 4 hours ago
Amazing!