on a related note, at what point are people going to get tired of waiting 20s for an llm to answer their questions? i wish it were more common for smaller models to be used when sufficient.
https://www.nvidia.com/en-us/ai/nim-for-manufacturing/
Word on the street is the project has yielded largely unimpressive results compared to its potential, but NV is still investing in an attempt to further raise the GPU saturation waterline.
p.s. This project logo stood out to me as presenting the Llama releasing some "steam" with gusto. I wonder if that was intentional? Sorry for the immature take, but resisting the scatological jokes is tough.
I've been trying to actually fine-tune DeepSeek (not the distills), and there are few options.
I found this link more useful.
"LLaMA Factory is an easy-to-use and efficient platform for training and fine-tuning large language models. With LLaMA Factory, you can fine-tune hundreds of pre-trained models locally without writing any code."
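For anyone curious what "without writing any code" looks like in practice, a LoRA SFT run is driven by a single YAML file. Here's a rough sketch; the key names follow my recollection of the project's bundled example configs, so double-check them against the `examples/` directory of whatever release you have installed:

```yaml
# lora_sft.yaml -- minimal LoRA supervised fine-tuning sketch for LLaMA Factory.
# NOTE: keys are from memory of the project's example configs; verify before use.
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # any supported base model
stage: sft                  # supervised fine-tuning
do_train: true
finetuning_type: lora       # train lightweight LoRA adapters, not full weights
lora_target: all
dataset: alpaca_en_demo     # one of the bundled demo datasets
template: llama3            # chat template matching the base model
cutoff_len: 1024
output_dir: saves/llama3-8b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

You'd then launch it with something like `llamafactory-cli train lora_sft.yaml` (again, an assumption from memory; the repo's README shows the exact entry point). The web UI wraps essentially the same config.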
Always curious to see what other ai enthusiasts are running!