There are plenty of self-hosted models being released all the time; they just don't make it to HN. For that, you need to find a community that is passionate about testing and tinkering with self-hosted models. A very popular one is "/r/localllama" on Reddit, but there are a few others scattered around.
The Register, Slashdot, and Hackaday are the ones I know of.
And my old favorite models broke, so I have to link to different versions. nous-hermes2-mixtral, I miss your sage banter.
Now everything runs with excessive lag.
I mostly use them for game assets.
TRELLIS.2 is very cool. I've managed to put together an SDXL -> TRELLIS -> UniRig pipeline to generate 3D characters with Mixamo skeletons, and it's working pretty well (rough sketch after the links below).
On the LLM front, DeepSeek and Qwen are still cranking away. Qwen3 A22B Instruct, imho, does a better job than Gemini in some cases with OCR and translation of handwritten documents.
The problem with these frontier open-weight models is that running them locally is not exactly tenable. You either have to get a cloud GPU instance or go through a provider.
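For the provider route, here's a minimal sketch of what that looks like, assuming an OpenAI-compatible endpoint; the base URL, API key, and model id are placeholders, not a specific recommendation:

    # Minimal sketch: calling a hosted open-weight model through an
    # OpenAI-compatible provider endpoint. The base_url, api_key, and
    # model id below are placeholders -- substitute your provider's values.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://example-provider.com/v1",  # hypothetical endpoint
        api_key="YOUR_API_KEY",
    )

    resp = client.chat.completions.create(
        model="qwen3-a22b-instruct",  # whatever id your provider exposes
        messages=[
            # For OCR you would attach the page image using the provider's
            # vision/content format; plain text shown here for brevity.
            {"role": "user", "content": "Transcribe and translate this handwritten note: ..."},
        ],
    )
    print(resp.choices[0].message.content)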
- https://github.com/microsoft/TRELLIS.2
- https://github.com/VAST-AI-Research/UniRig
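And the rough sketch of how the pipeline hangs together. Only the SDXL step uses a concrete API (diffusers); the TRELLIS.2 and UniRig steps are shown as placeholder functions standing in for those repos' own inference scripts, since their exact entry points live in the projects linked above:

    # Rough sketch of an SDXL -> TRELLIS -> UniRig character pipeline.
    # The TRELLIS.2 and UniRig steps are placeholders -- check each repo's
    # README for the real inference entry points.
    import torch
    from diffusers import StableDiffusionXLPipeline

    def generate_concept(prompt: str):
        """Step 1: text -> character concept image with SDXL."""
        pipe = StableDiffusionXLPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
        ).to("cuda")
        return pipe(prompt).images[0]

    def image_to_mesh(image):
        """Step 2 (placeholder): image -> 3D mesh via TRELLIS.2.
        In practice this calls the inference code from
        https://github.com/microsoft/TRELLIS.2 and exports a mesh file."""
        raise NotImplementedError("wire up TRELLIS.2 inference here")

    def rig_mesh(mesh_path: str):
        """Step 3 (placeholder): auto-rig the mesh with UniRig so it can
        drive a Mixamo-style skeleton.
        See https://github.com/VAST-AI-Research/UniRig for the actual tooling."""
        raise NotImplementedError("wire up UniRig here")

    if __name__ == "__main__":
        img = generate_concept("full-body game character, T-pose, neutral lighting")
        img.save("concept.png")
        # mesh = image_to_mesh(img)
        # rigged = rig_mesh(mesh)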