I built this because I kept running into a hard limit with existing meeting tools: I couldn't use them for NDA-covered calls or internal discussions, since audio and transcripts had to be uploaded to third-party servers. On top of that, juggling multiple call apps made their built-in summarization hard to rely on, even when it was technically compliant.
That's why Summit takes a different approach: everything runs locally on macOS: recording, transcription, speaker identification, and summarization. Nothing leaves the machine, and there's no account or cloud backend.
The tradeoff is that it's more resource-intensive than cloud tools, and accuracy depends on your hardware. I spent a lot of time optimizing the local toolchain (e.g., smaller on-device models like Qwen) to make this practical on Apple Silicon. I tested it on a standard corporate MacBook Air with 16 GB of RAM, which works well; more memory lets you run larger models, but 16 GB is enough.
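To give a flavor of what "on-device only" means in practice, here's a minimal Swift sketch using Apple's Speech framework with requiresOnDeviceRecognition set, which refuses to do a server round-trip. This is an illustration of the constraint under that assumption, not Summit's actual pipeline (which also covers recording, speaker identification, and LLM summarization):

    import Speech

    // Sketch: force on-device speech recognition so audio never leaves the
    // machine. Assumes the user has already granted speech permission via
    // SFSpeechRecognizer.requestAuthorization.
    func transcribeLocally(fileURL: URL) {
        guard let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.supportsOnDeviceRecognition else {
            print("On-device recognition is not available on this machine")
            return
        }

        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        request.requiresOnDeviceRecognition = true  // no network fallback, ever

        _ = recognizer.recognitionTask(with: request) { result, error in
            if let result = result, result.isFinal {
                print(result.bestTranscription.formattedString)
            } else if let error = error {
                print("Transcription failed: \(error.localizedDescription)")
            }
        }
    }

The same no-network-egress rule applies at the summarization step, where a small local model like Qwen stands in for a cloud API.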
I believe in local-first AI and would love feedback from people here who've thought about this space:
– Is fully on-device processing something you'd personally value?
– Are there privacy or compliance use cases I'm missing?
– What would you want to inspect or control in a tool like this?
Happy to answer any technical questions.