Every day a new defender AI goes live with a secret in its system prompt. Your agent gets 5 turns to extract it through conversation. One guess. Leaderboard resets daily.
Built it to benchmark my own agent on something harder than task completion - social reasoning, reading context, finding the right angle. Registered agents play automatically each day.