11 points by tosh 7 hours ago | 2 comments
  • 2001zhaozhao 2 hours ago
    It would be interesting to see whether they have an updated version of a model that employs this training technique. According to the paper, it scored well on release (65.8% on SWE-bench), but it no longer scores competitively against the latest-generation open coding models (e.g. Devstral Small 2).

    I wonder whether other labs have implemented something similar to this approach. Perhaps code world modeling isn't actually necessary (relative to other, simpler techniques) to achieve the kind of deep environment understanding that the paper touts as important for improving agentic coding performance.

    • general_reveal an hour ago
      Serious question: how do we know these benchmark suites are any good?
  • chid 2 hours ago
    Given the high bar of entry (160 GB of GPU VRAM), is there anything practical one can use this for?