The tool, which is documented here (https://mlx-optiq.pages.dev/), also implements the recently announced TurboQuant KV-cache optimization, so taken together this should greatly improve the quality of locally run LLMs.
Looking forward to an OptiQ release of the Gemma 4 family.