I think a major AGI checkpoint will ultimately be when an agent can learn to play and beat any video game in about the same amount of time as a regular human, without cheating by reading the game's memory, while also being able to explain itself and help teach others. One of the major gaps at the moment is that models don't really understand what they're doing.
Ideally you could hook an agent up to a desktop with emulated peripherals and it would learn how to play and ultimately climb to the top ranks of something multiplayer and competitive like League of Legends, without getting caught as a bot or cheater, in less than one or two thousand hours of gameplay. But instead of grinding endless games like a moron, it should be coachable and capable of reviewing replays and studying YouTube videos.