Hacker News
Show HN: Agent-evals – Claude skill to build your own evals (github.com)
6 points by sauercrowd 6 hours ago | 1 comment
johnjudeh 4 hours ago
Thanks for sharing! It’s way easier to build an agent that can complete a task than to make sure it works across all the cases you care about, especially when the output quality is really subjective.