Hey HN! I had some free time this weekend, so I ran a small experiment on Claude Sonnet 4.6 with and without a simulated end_conversation tool.
This post is essentially the paper in blog form, because I don't have an arXiv endorsement yet haha, but I found the results pretty cool.