Hacker News

Agent-evals: Metacognitive scoring and boundary testing for LLM coding agents (thinkwright.ai)

2 points by oceanwaves 12 hours ago | 1 comment

