On-edge inference starts with the fastest kernels, and getting those kernels requires fast, deterministic, state-aware feedback loops for agents. This is the first such effort for Metal, and I hope people will contribute to it. I think more effort should go into using Apple silicon efficiently.