Some questions:
Instead of the discovery mechanism, could I tell it directly where other workers are? (Also, I take it that the "Quick Start" example uses only local workers?)
Is the "local only" case basically equivalent to `multiprocessing`?
Sorry, the docs are sparse at the moment. The discovery subpackage has two modules, `local.py` and `lan.py`, which contain the implementations. The discovery protocol is easy to extend: no inheritance is required, you only need to satisfy the protocol. You could implement a simple discovery publisher that publishes your worker addresses and a subscriber that knows where to look them up. The local implementation uses shared memory to register workers, for example, but the backing store could be anything: a database, a message queue, whatever you want.
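To make the "tell it directly where the workers are" idea concrete, here is a rough sketch of a static publisher/subscriber pair built on structural typing. Every name in it (`WorkerAddress`, `DiscoverySubscriber`, `StaticPublisher`, `StaticSubscriber`, `publish`, `unpublish`, `workers`) is hypothetical and chosen for illustration only; the real method names and signatures are whatever `local.py` and `lan.py` define, so treat this as the shape of the approach rather than the library's interface.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol, runtime_checkable


@dataclass(frozen=True, order=True)
class WorkerAddress:
    """Hypothetical address record; the real library may use another type."""
    host: str
    port: int


@runtime_checkable
class DiscoverySubscriber(Protocol):
    """Hypothetical protocol: anything with a matching workers() satisfies it."""
    def workers(self) -> Iterator[WorkerAddress]: ...


def parse_address(text: str) -> WorkerAddress:
    """Parse 'host:port' into a WorkerAddress."""
    host, _, port = text.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {text!r}")
    return WorkerAddress(host, int(port))


class StaticPublisher:
    """Registers worker addresses in a plain in-memory set.

    The set stands in for any backing store (shared memory, a database,
    a message queue).
    """
    def __init__(self, registry: set) -> None:
        self._registry = registry

    def publish(self, address: WorkerAddress) -> None:
        self._registry.add(address)

    def unpublish(self, address: WorkerAddress) -> None:
        self._registry.discard(address)


class StaticSubscriber:
    """Yields a fixed list of addresses plus anything found in the registry.

    Note that it does not inherit from DiscoverySubscriber; it only
    provides a method with the same name and shape.
    """
    def __init__(self, addresses: Iterable[str] = (), registry: set = None) -> None:
        self._fixed = {parse_address(a) for a in addresses}
        self._registry = registry if registry is not None else set()

    def workers(self) -> Iterator[WorkerAddress]:
        # Sorted so the iteration order is deterministic.
        return iter(sorted(self._fixed | self._registry))


if __name__ == "__main__":
    registry = set()
    StaticPublisher(registry).publish(WorkerAddress("10.0.0.7", 5002))
    subscriber = StaticSubscriber(["10.0.0.5:5000"], registry)
    for worker in subscriber.workers():
        print(worker.host, worker.port)
```

The point of the sketch is that `StaticSubscriber` satisfies the protocol purely by having the right method, so swapping the in-memory set for a database lookup would not change the calling code.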
I'm open to suggestions for a more ergonomic API if you have a specific use case in mind.