I've been using VoiceInk on macOS for a few months now. The workflow is just: hold a shortcut, speak, release, and the text appears at the cursor. It works in a terminal, editor, chat window, wherever.
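For anyone curious about the mechanics, the hold-speak-release flow is basically a tiny push-to-talk state machine wrapped around a recorder, a transcriber, and a text injector. Here's a minimal sketch of that shape; the callback names (`record_start`, `transcribe`, `paste`, etc.) are hypothetical stand-ins, not any particular app's API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PushToTalk:
    """Hypothetical sketch of the hold-to-dictate loop these apps implement."""
    record_start: Callable[[], None]       # begin capturing mic audio
    record_stop: Callable[[], bytes]       # stop and return the captured audio
    transcribe: Callable[[bytes], str]     # e.g. run a local Whisper-family model
    paste: Callable[[str], None]           # inject text at the current cursor
    recording: bool = False

    def on_hotkey_press(self) -> None:
        # Hold: start recording (ignore key-repeat while already recording).
        if not self.recording:
            self.recording = True
            self.record_start()

    def on_hotkey_release(self) -> None:
        # Release: stop, transcribe, and paste into the active window.
        if self.recording:
            self.recording = False
            audio = self.record_stop()
            self.paste(self.transcribe(audio))

# Wiring it up with fakes to show the flow end to end:
out = []
ptt = PushToTalk(
    record_start=lambda: None,
    record_stop=lambda: b"fake-audio",
    transcribe=lambda audio: "hello world",
    paste=out.append,
)
ptt.on_hotkey_press()
ptt.on_hotkey_release()
# out now holds the transcribed text that would be pasted at the cursor
```

The real apps differ mainly in what they plug into those slots: which hotkey library, which local model, and whether a post-processing step sits between transcribe and paste.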
The post covers Handy, Whispering, VoiceInk, OpenWhispr, and FluidVoice. All are open-source, all do local transcription, and all paste directly into the active window. The differences are mostly platform support, model selection, and how much extra stuff (AI post-processing, voice-activated mode, etc.) they add on top.
Happy to answer questions about any of these or about the voice-typing-for-agents workflow in general.