* a data-free LLM quantization method which they claim outperforms all prior data-free approaches, including NF4; and
* a method, which they claim is optimal, for finding non-uniform per-layer quantization levels that match a given compression constraint in the "medium bitwidth" regime.
They demonstrate improved accuracy-compression trade-offs on popular LLMs.
Thank you for sharing this on HN.
Bringing this up because the abstract (and the mention of rotations) reminded me of recent LLM interpretability posts.