Here's the code - https://github.com/prajwal-y/video_explainer

Some cool features that I added:
- Automated script/narration generation from the source material (using your favorite LLM).
- Fully automated video generation from the script using Remotion.
- Natural language editing of videos. Just use a CLI tool to give feedback, and the system goes back and fixes the video (using Claude Code in headless mode internally LOL).
- TTS built in, but also an easy way to bring your own voiceovers and sync them to the video scenes automatically (using Whisper transcription).
- AI background music and sound effects.
- A CLI tool to interact with various components in the pipeline.
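To give a rough idea of what syncing a recorded voiceover to scenes can look like, here is a minimal sketch. It is not the code from the repository: `Word`, `normalize` and `align_scenes` are hypothetical names, and it assumes you already have word-level timestamps from a transcription (the kind of output Whisper can produce) and that the recording follows the script word for word. A real pipeline would likely need fuzzy matching to handle misreads and transcription errors.

```python
from dataclasses import dataclass
import re


@dataclass
class Word:
    """One transcribed word with its start/end time in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def normalize(token: str) -> str:
    """Lowercase a token and strip punctuation so bare punctuation is not counted as a word."""
    return re.sub(r"[^a-z0-9]", "", token.lower())


def align_scenes(scene_scripts: list[str], words: list[Word]) -> list[tuple[float, float]]:
    """Give each scene a (start, end) time by consuming transcript words in order.

    Naive assumption: the voiceover contains exactly the script's words, in order,
    so scene N covers the next len(scene N) words of the transcript.
    """
    timings = []
    cursor = 0
    for script in scene_scripts:
        count = len([t for t in script.split() if normalize(t)])
        chunk = words[cursor:cursor + count]
        if not chunk:
            raise ValueError("transcript ran out of words before the scenes did")
        timings.append((chunk[0].start, chunk[-1].end))
        cursor += count
    return timings


# Made-up example data: two scenes and a six-word transcript.
scenes = ["Hello world.", "This is scene two."]
transcript = [
    Word("Hello", 0.0, 0.4), Word("world.", 0.5, 0.9),
    Word("This", 1.2, 1.4), Word("is", 1.4, 1.5),
    Word("scene", 1.6, 1.9), Word("two.", 2.0, 2.3),
]
print(align_scenes(scenes, transcript))  # [(0.0, 0.9), (1.2, 2.3)]
```

The resulting per-scene start and end times are what you would then use to set each scene's duration in the video.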
The first video I generated using this pipeline (an explainer of AI inference optimizations) - https://www.youtube.com/watch?v=SyFcoaIVad4

Everything in the video was automatically generated by the system, including the script, narration, audio effects and the background music (all code in the repository). However, I recorded the voiceover myself, as the TTS was too robotic (although the system generated the script for me to read haha).
I'm absolutely mind-blown that something like this can be built in the span of 3 days. I've been a professional software engineer for almost 10 years, and building something like this would've likely taken me months without AI. I truly believe now that our profession is going to change dramatically in the next few years.