Being able to super-power apps with on-device models is a lot of fun. I recently did the same building my own dictation app using small local models, and I still can't believe how effective it is. The download is just 20mb, though it will download parakeet ~475mb for audio, but can use the on-device model as the second-pass LLM and works pretty well (though better models are available to download and use e.g. Llama 3.2 4bit and Qwen 2.5 7B 4bit)
I'm currently building a little tool for a professional photographer friend to go through and classify images in their photoshoots, so I can build a searchable db for them to quickly find very specific images in the future. I simply don't think it would have been possible for me to build a tool like that just a couple years ago at any price.