> Post-training methods allow teams to refine model behavior for specific tasks and environments.
How do you suppose this works? They say "pretraining" but I'm certain that the amount of clean data available in proper dataset format is not nearly enough to make a "foundation model". Do you suppose what they are calling "pretraining" is actually SFT and then "post-training" is ... more SFT?
There's no way they mean "start from scratch". Maybe they do something like generate a heckin bunch of synthetic data seeded from company data using one of their SOTA models, which is basically equivalent to low-resolution distillation, I would imagine. Hmm.
But seriously, RAG/retrieval is thriving. It'll be part of the mix alongside long context, reranking, and tool-based context assembly for the foreseeable future.
So it'd be alive in the decision-making sense, not in a "the technology is thriving" sense.
It's feasible for small models, but I thought small models were not reliable for factual information?
Foundational:
- Pretraining
- Mid/post-training (SFT)
- RLHF or alignment post-training (RL)
And sometimes...
- Some more customer-specific fine-tuning.
Note that any supervised fine-tuning following the Pretraining stage is just swapping the dataset and maybe tweaking some of the optimiser settings. Presumably they're talking about this kind of pre-RL fine-tuning instead of post-RL fine-tuning, and not about swapping out the Pretraining stage entirely.
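To make the "just swapping the dataset" point concrete, here is a toy sketch, not anyone's actual pipeline. It assumes a one-parameter model `y = w * x` trained with plain SGD; the `Stage` class, the `train` function, and all the numbers are made up for illustration. The point is only that the same loop runs in both stages, and the fine-tuning stage starts from the pretrained weight with a different dataset and a smaller learning rate.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Hypothetical container for what changes between training stages."""
    name: str
    data: list      # list of (x, y) pairs
    lr: float       # learning rate (the "optimiser setting" being tweaked)
    epochs: int


def train(w: float, stage: Stage) -> float:
    """Plain SGD on squared error for the toy model y = w * x."""
    for _ in range(stage.epochs):
        for x, y in stage.data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= stage.lr * grad
    return w


# "Pretraining": generic data where the underlying relation is y = 2x.
pretrain = Stage("pretraining", [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)],
                 lr=0.05, epochs=50)

# "SFT": a smaller, different dataset (y = 3x), lower learning rate, fewer epochs.
finetune = Stage("sft", [(x, 3.0 * x) for x in (1.0, 2.0)],
                 lr=0.01, epochs=5)

w_pre = train(0.0, pretrain)    # start from scratch
w_sft = train(w_pre, finetune)  # same loop, continue from pretrained weight

print(f"after {pretrain.name}: w = {w_pre:.3f}")
print(f"after {finetune.name}: w = {w_sft:.3f}")
```

With these settings the pretrained weight settles at the generic relation, and the short fine-tuning run nudges it toward the new data without fully overwriting what came before. That is roughly the trade-off people tune with the learning rate and the number of epochs.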
Is it possible to retrain daily or hourly as info changes?