I wonder whether other labs have implemented something similar to this approach. Perhaps code world modeling isn't actually necessary, relative to simpler techniques, for achieving the kind of deep environment understanding the paper touts as important for improving agentic coding performance.
There are a few pre-quantized options[0], or you can quantize it yourself with llama.cpp[1]. You can run the resulting GGUF with llama.cpp's `llama-cli` or `llama-server`, with LM Studio, or with Ollama.
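If you go the quantize-it-yourself route, the llama.cpp flow looks roughly like this (the model directory, output filenames, and the Q4_K_M quant type are illustrative, not specific to this model):

```shell
# Convert the Hugging Face checkpoint to a full-precision GGUF.
# convert_hf_to_gguf.py ships in the llama.cpp repo.
python convert_hf_to_gguf.py ./my-model-hf \
    --outfile model-f16.gguf --outtype f16

# Quantize it; Q4_K_M is a common size/quality tradeoff.
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M

# Run interactively...
./llama-cli -m model-Q4_K_M.gguf -p "Hello"

# ...or serve it over an OpenAI-compatible HTTP API.
./llama-server -m model-Q4_K_M.gguf --port 8080
```

LM Studio and Ollama can then load the same quantized GGUF file directly.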