4 points by divyaprakash 2 hours ago | 1 comment
  • divyaprakash 2 hours ago
    I built this because I was tired of "AI tools" that were just wrappers around expensive APIs with high latency. As a developer who lives in the terminal (Arch/Nushell), I wanted something that felt like a CLI tool and respected my hardware.

    The Tech:

        GPU Heavy: It uses decord and PyTorch for scene analysis. I’m calculating action density and spectral flux locally to find hooks before hitting an LLM.
    
        Local Audio: I’m using ChatterBox locally for TTS to avoid recurring costs and privacy leaks.
    
        Rendering: Final assembly is offloaded to NVENC.
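
    For anyone unfamiliar with the term, here is a minimal sketch of spectral flux as an audio "hook" signal. It is not the project's actual code: the function name `spectral_flux`, the frame and hop sizes, and the use of NumPy instead of PyTorch are all assumptions made for illustration. It measures how much the magnitude spectrum grows from one frame to the next, so sudden onsets show up as peaks.

```python
import numpy as np


def spectral_flux(samples, frame_len=1024, hop=512):
    """Positive spectral flux between consecutive frames of a mono signal.

    Hypothetical helper: frame_len and hop are illustrative defaults.
    Returns one value per frame transition (n_frames - 1 values).
    """
    samples = np.asarray(samples, dtype=np.float64)
    n_frames = 1 + (len(samples) - frame_len) // hop
    if n_frames < 2:
        return np.zeros(0)

    # Slice the signal into overlapping, Hann-windowed frames.
    window = np.hanning(frame_len)
    frames = np.stack(
        [samples[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude spectrum of each frame.
    mags = np.abs(np.fft.rfft(frames, axis=1))

    # Keep only increases in energy (half-wave rectification), then take
    # the Euclidean norm across frequency bins.
    rise = np.maximum(np.diff(mags, axis=0), 0.0)
    return np.sqrt((rise ** 2).sum(axis=1))


if __name__ == "__main__":
    sr = 16000
    silence = np.zeros(4096)
    tone = np.sin(2 * np.pi * 440 * np.arange(4096) / sr)
    flux = spectral_flux(np.concatenate([silence, tone]))
    # The largest value should sit where the tone starts.
    print(int(np.argmax(flux)), len(flux))
```

    Peaks in the returned array are candidate moments to rank before any LLM call; how they get combined with the visual action-density score is up to the pipeline.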
    
    Looking for Collaborators: I’m currently looking for PRs specifically around:

        Intelligent Auto-Zoom: Using YOLO/RT-DETR to follow the action in a 9:16 crop.
    
        Voice Engine Upgrades: Moving toward ChatterBoxTurbo or NVIDIA's latest TTS.
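
    For the auto-zoom item, a rough sketch of the geometry a PR would need, assuming a detector (YOLO, RT-DETR, or anything else) already gives a per-frame horizontal center for the subject. The names `crop_window` and `smooth_centers` and the smoothing factor are hypothetical, not part of the repo.

```python
def crop_window(frame_w, frame_h, cx, aspect=9 / 16):
    """Return (x, y, w, h) of a full-height portrait crop centered near cx.

    Hypothetical helper: keeps the crop inside the frame and forces an
    even width, which most H.264 encoders expect.
    """
    crop_w = min(frame_w, int(round(frame_h * aspect)))
    crop_w -= crop_w % 2
    x = int(round(cx - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))  # clamp to frame bounds
    return x, 0, crop_w, frame_h


def smooth_centers(centers, alpha=0.2):
    """Exponential moving average so the crop does not jitter frame to frame."""
    smoothed, prev = [], None
    for c in centers:
        prev = c if prev is None else alpha * c + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    # Subject drifting right across a 1920x1080 frame.
    for cx in smooth_centers([400, 900, 1500, 1900]):
        print(crop_window(1920, 1080, cx))
```

    A lower `alpha` gives a steadier but laggier virtual camera; a real implementation would probably also want to handle frames with no detection.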
    
    It's fully Dockerized and also ships with a Makefile. Would love some feedback on the pipeline architecture!