It is really beautiful to see it distilled to the core algorithm. Despite AK’s claim that everything else is simply optimization, I am sure there’s a bit more to training a useful frontier model! Still, it’s an excellent teaching tool and a nice way to spend an afternoon walking through the code.