I was tired of the ‘SaaS sprawl’ in my workflow: paying for five different AI subscriptions just to get one project done. So I built SnapRookies, a unified orchestration layer that routes each task to a specialized model based on its creative intent.
The Architecture:
Rather than a thin wrapper around a single model, each vertical chains several models together. For example, our 'Professional Headshot' pipeline isn't just one prompt; it’s a multi-stage workflow that uses FLUX.1 [dev] for base generation, a custom-trained LoRA for consistency, and a targeted GFPGAN pass for facial restoration.
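To make the multi-stage idea concrete, here is a minimal sketch of how a vertical could map to an ordered list of stages. This is not our production code: `Job`, `PIPELINES`, `run_vertical`, and the three stage functions are hypothetical names, and the stages only tag a string instead of calling a model. In a real setup the LoRA weights would usually be loaded into the base model at generation time; it is shown as its own stage here only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Job:
    """Carries the prompt and intermediate outputs through the pipeline."""
    prompt: str
    artifacts: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Stage = Callable[[Job], Job]


def base_generation(job: Job) -> Job:
    # Placeholder for a FLUX.1 [dev] call (assumption: no real model is invoked).
    job.artifacts["image"] = f"base({job.prompt})"
    job.history.append("base_generation")
    return job


def lora_consistency(job: Job) -> Job:
    # Placeholder for applying the custom-trained LoRA.
    job.artifacts["image"] = f"lora({job.artifacts['image']})"
    job.history.append("lora_consistency")
    return job


def face_restoration(job: Job) -> Job:
    # Placeholder for the GFPGAN restoration pass.
    job.artifacts["image"] = f"gfpgan({job.artifacts['image']})"
    job.history.append("face_restoration")
    return job


# Hypothetical registry: one ordered stage list per vertical.
PIPELINES: Dict[str, List[Stage]] = {
    "professional_headshot": [base_generation, lora_consistency, face_restoration],
}


def run_vertical(vertical: str, prompt: str) -> Job:
    """Look up the vertical's stages and run them in order."""
    if vertical not in PIPELINES:
        raise ValueError(f"unknown vertical: {vertical}")
    job = Job(prompt=prompt)
    for stage in PIPELINES[vertical]:
        job = stage(job)
    return job
```

Adding a new vertical in this layout means registering another stage list, which is roughly how intent-based routing stays separate from the model calls themselves.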
The Stack:
Model Orchestration: Python/FastAPI backend managing asynchronous requests across Replicate, Modal, and custom GPU clusters.
Video Pipeline: Direct integration with Kling 2.6 and CogVideoX for temporally consistent video generation.
Frontend: Next.js 15, optimized for heavy multi-modal asset management.
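For anyone curious what "managing asynchronous requests" across several providers can look like, here is a rough asyncio sketch with per-call timeouts, ordered fallback, and a concurrency cap. Everything in it is illustrative: `call_provider` is a stub, not the Replicate or Modal SDK, and `dispatch`, `dispatch_many`, and the `failing` parameter are hypothetical names used only to simulate an outage.

```python
import asyncio
from typing import Dict, FrozenSet, Iterable, List


class ProviderError(Exception):
    """Raised when a provider call fails or every provider is exhausted."""


async def call_provider(name: str, payload: str, fail: bool = False) -> Dict[str, str]:
    # Stub standing in for a real HTTP/SDK call (assumption: no network access).
    await asyncio.sleep(0)
    if fail:
        raise ProviderError(f"{name} unavailable")
    return {"provider": name, "result": f"{name}:{payload}"}


async def dispatch(
    payload: str,
    providers: Iterable[str],
    timeout_s: float = 5.0,
    failing: FrozenSet[str] = frozenset(),
) -> Dict[str, str]:
    """Try providers in order; fall through on error or timeout."""
    for name in providers:
        try:
            return await asyncio.wait_for(
                call_provider(name, payload, fail=name in failing), timeout_s
            )
        except (ProviderError, asyncio.TimeoutError):
            continue
    raise ProviderError("all providers failed")


async def dispatch_many(
    payloads: List[str],
    providers: List[str],
    max_concurrency: int = 4,
    failing: FrozenSet[str] = frozenset(),
) -> List[Dict[str, str]]:
    """Run many jobs concurrently, capped by a semaphore; results keep input order."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(payload: str) -> Dict[str, str]:
        async with sem:
            return await dispatch(payload, providers, failing=failing)

    return await asyncio.gather(*(guarded(p) for p in payloads))


if __name__ == "__main__":
    out = asyncio.run(
        dispatch_many(["a", "b"], ["replicate", "modal"], failing=frozenset({"replicate"}))
    )
    print(out)
```

Inside a FastAPI app you would await `dispatch_many` from an async route handler instead of calling `asyncio.run`.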
We currently support 27 'verticals' (from UGC ads to LinkedIn headshots). I’d love to hear your thoughts on how the latency and output quality compare to the 'big tech' proprietary black boxes.