Movies pictured robots like this long before this became possible, but how did the producers guess it?
Or maybe movies depicted many different kinds of robots, and this video only brings to mind the ones that look like this. A kind of confirmation bias?
The robot and ball poses are estimated by high-speed mocap cameras and fed to the policy.
I imagine estimating that with onboard cameras, the way humans do it, is much harder.
Almost all of closed-loop robotics is a state estimation problem. Control is “solved” if you can estimate state well enough.
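
To make the state estimation point a bit more concrete, here is a minimal sketch of a scalar Kalman filter in Python that recovers a position from noisy measurements. It is only an illustration, not what the system in the video uses: the function name `kalman_1d`, the random-walk motion model, and the noise values are all assumptions chosen for the example.

```python
import random


def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk motion model (illustrative).

    measurements: noisy position readings
    meas_var:     assumed measurement noise variance
    process_var:  assumed process noise variance per step
    x0, p0:       initial state estimate and its variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Hypothetical stationary target at 1.0, observed with noise (std 0.5).
    rng = random.Random(0)
    noisy = [1.0 + rng.gauss(0.0, 0.5) for _ in range(200)]
    est = kalman_1d(noisy, meas_var=0.25, process_var=1e-5)
    print("last raw reading:", noisy[-1])
    print("last filtered estimate:", est[-1])
```

The filtered estimate is what a controller would consume instead of the raw readings. With mocap the measurement noise is small and this step is easy; with onboard cameras the noise, latency, and occlusions are far worse, which is where the real difficulty lies.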