Hacker News
LLM INQUISITOR: Evaluating how AI models handle long, realistic tasks (github.com)
1 point by ballista2026 4 hours ago | 1 comment
ballista2026 4 hours ago [dead]