Hacker News
SWE-Bench Failures: When Coding Agents Spiral into 693 Lines of Hallucinations
(www.surgehq.ai)
22 points by landonxi 5 months ago | 1 comment
egillie 5 months ago
Is this because GPT-5 hallucinates less in general?