10 points by armcat 7 hours ago | 1 comment
  • _frenzal_rhomb_ 7 hours ago
    It's insane that a 9B model beats gpt-oss-120b on GPQA, is highly competitive on instruction following, and beats gpt-oss-20b in maths.
    • observationist 5 hours ago
      Hopefully it's not just benchmark maxxing - models are getting small enough to run on phones and standard consumer laptops. Things like AirLLM and other tricks allow for much larger models to be run at slower speeds, too. You might use one of these small models to drive an agent, and escalate to slower, more powerful models run locally when required.
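
      For what it's worth, the escalation idea could be sketched roughly like this. Everything here is hypothetical: the `Escalator` class, the 0.7 threshold, and the stub models are made up for illustration, and it assumes each model can return some confidence score alongside its answer, which real local runtimes may not expose directly.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumption: a "model" is any callable that maps a prompt to
# (answer, confidence in [0, 1]). Real runtimes need an adapter for this.
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Escalator:
    """Try the small model first; fall back to the large one if unsure."""
    small: Model
    large: Model
    threshold: float = 0.7  # hypothetical cutoff, tune per task

    def ask(self, prompt: str) -> Tuple[str, str]:
        answer, confidence = self.small(prompt)
        if confidence >= self.threshold:
            return answer, "small"
        # Small model was not confident enough: escalate to the slower model.
        answer, _ = self.large(prompt)
        return answer, "large"


# Stub models standing in for real local inference calls.
def small_stub(prompt: str) -> Tuple[str, float]:
    # Pretend the small model is only confident on short prompts.
    return "small: " + prompt, 0.9 if len(prompt) < 40 else 0.3


def large_stub(prompt: str) -> Tuple[str, float]:
    return "large: " + prompt, 0.95


router = Escalator(small_stub, large_stub)
```

      With the stubs above, `router.ask(...)` answers short prompts with the small model and hands longer ones to the large model. In practice the confidence signal is the hard part.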

      Phenomenal that they're releasing the base models as well as the tuned ones. The US really needs to step up its open-source AI game; we're getting utterly trounced.