Troy.
I skimmed through the outline. Will take a look at the individual videos when I'm at my PC.
But I have been through this cost-saving phase. I didn't see "prompt distillation" as one of the techniques in your outline.
The idea is to shrink your fixed prompt tokens, such as the "system prompt", by stripping out words that carry little semantic weight, among other methods. I saw a whopping 60% decrease in my fixed prompt token budget.
Please note, my scale is small, but the technique works nonetheless.
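A rough sketch of the simplest version of the idea, in case it helps. Everything named here is an illustrative assumption: the `FILLER_WORDS` list is hand-picked, and `rough_token_count` just counts whitespace-separated words as a stand-in for tokens. A real measurement would use the model's own tokenizer, and the savings will depend on the prompt.

```python
# Hypothetical filler list: words that often add little meaning to an instruction.
# Tune this per prompt; it is an assumption, not a standard list.
FILLER_WORDS = {"please", "kindly", "very", "really", "just", "a", "an", "the", "that"}


def distill_prompt(prompt: str) -> str:
    """Drop filler words and collapse runs of whitespace."""
    kept = [
        word
        for word in prompt.split()
        if word.lower().strip(".,!?") not in FILLER_WORDS
    ]
    return " ".join(kept)


def rough_token_count(text: str) -> int:
    """Crude proxy for token count: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def reduction_pct(original: str, distilled: str) -> float:
    """Percentage drop in the rough token count from original to distilled."""
    before = rough_token_count(original)
    after = rough_token_count(distilled)
    return 100.0 * (before - after) / before if before else 0.0


if __name__ == "__main__":
    system_prompt = "Please just answer the question in a very concise way."
    distilled = distill_prompt(system_prompt)
    print(distilled)
    print(f"{reduction_pct(system_prompt, distilled):.0f}% fewer words")
```

Worth re-running your evals on the distilled prompt before shipping it, since dropping words can change how the model behaves.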
Edit: so, have you tried it? If so, how did it go?