The main thing I see here is that prompts never touch a third-party server. If you're in a regulated industry or just don't want proprietary context hitting an API, running inference on your own hardware with encrypted p2p access from any device is really cool (and useful).
(Staying in userspace via tsnet, with no TUN device or kernel-level network setup, is a nice touch too.)