Building zero-cost AI infrastructure on consumer hardware. Dual-model orchestration: local GPU for the grunt work, cloud API for the thinking.