2 points by usespoke | 11 hours ago | 1 comment

  • usespoke | 11 hours ago
    I built Spoke because I was tired of paying $15/month for a dictation app that sends my audio to a cloud server I don't control.

    Spoke runs a 600M-parameter speech model (NVIDIA Parakeet TDT) entirely on-device — no internet required, and audio never leaves your Mac. On Apple Silicon it transcribes 60 seconds of audio in ~400ms (150x realtime). Word error rate is 6.34%, versus 7.4% for Whisper large-v3, with a model 2.6x smaller.

    The part I'm most proud of is the Flow builder — a visual automation engine on top of the transcription layer. Instead of just "speak → insert text", you can chain 14 node types: AI Skills (with 5 provider options including Ollama for fully local LLMs), webhooks, AppleScript, Shortcuts, conditional routing by active app, text transforms, clipboard, file saves, and more. So you can do things like: speak casually → rewrite to professional tone → insert into the active app → send a webhook log → save to a daily journal file. All triggered from a single keypress.
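    To make the "chain of nodes" idea concrete, here's a minimal sketch of how such a pipeline could be modeled. All names here are hypothetical — this is an illustration of the pattern, not Spoke's actual API:

```swift
import Foundation

// Hypothetical sketch of a flow: each node either transforms the
// transcript, conditionally runs a branch, or fires a side effect
// (webhook log, file save) without changing the text.
enum FlowNode {
    case transform((String) -> String)
    case route(condition: (String) -> Bool, then: [FlowNode])
    case sideEffect((String) -> Void)
}

func runFlow(_ nodes: [FlowNode], input: String) -> String {
    var text = input
    for node in nodes {
        switch node {
        case .transform(let f):
            text = f(text)
        case .route(let condition, let branch):
            if condition(text) { text = runFlow(branch, input: text) }
        case .sideEffect(let effect):
            effect(text)
        }
    }
    return text
}

// Example chain: trim the raw transcript, rewrite it, log a copy.
var webhookLog: [String] = []
let flow: [FlowNode] = [
    .transform { $0.trimmingCharacters(in: .whitespaces) },
    .transform { "Please find attached: " + $0 },
    .sideEffect { webhookLog.append($0) },
]
let result = runFlow(flow, input: "  the report  ")
```

    The point of the design is that every node has the same shape (text in, text out, or text in, effect out), so nodes compose freely and a branch is just another flow.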

    A few things I deliberately did differently:

    - Native SwiftUI, not Electron — under 50MB RAM at idle vs 500-800MB for cloud alternatives
    - No account required
    - $9.99 one-time vs $180/year competitors (50 free uses to try it)
    - API keys stored in the macOS Keychain, not on remote servers
    - Per-app flow configuration (different behavior in VS Code vs Slack vs Mail)
    - Voice ID — biometric speaker verification so it only responds to you
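    For the per-app configuration, one plausible shape is a lookup from the frontmost app's bundle identifier to a flow, with a default fallback. Again, the names are hypothetical, not Spoke's actual data model:

```swift
import Foundation

// Hypothetical sketch: pick a flow based on the active app's
// bundle identifier, falling back to a default flow.
struct FlowConfig {
    var perApp: [String: String]   // bundle ID -> flow name
    var fallback: String

    func flowName(forBundleID id: String) -> String {
        perApp[id] ?? fallback
    }
}

let config = FlowConfig(
    perApp: [
        "com.microsoft.VSCode": "verbatim-insert",
        "com.tinyspeck.slackmacgap": "casual-tone",
        "com.apple.mail": "professional-tone",
    ],
    fallback: "verbatim-insert"
)
let mailFlow = config.flowName(forBundleID: "com.apple.mail")
let unknownFlow = config.flowName(forBundleID: "com.example.unknown")
```

    On macOS the frontmost app's bundle ID is available via `NSWorkspace.shared.frontmostApplication?.bundleIdentifier`, which is what a lookup like this would key on.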

    I'm a solo developer and shipped this about two weeks ago. It's had its first real users, and I've been iterating fast based on feedback — just shipped v1.1.0 yesterday.

    Would love honest feedback — especially from people who've tried Superwhisper, Wispr Flow, or similar tools. What did I miss? What would make you switch?

    https://usespoke.app