AlazarManakelew 4 hours ago
Hi, I'm Alazer. I wrote a benchmark, kernel collection, and agentic closed-loop system for authoring Metal kernels that are efficient across a wide range of M-series chips (M1 through the current M5).
On-edge inference starts with the fastest kernels, and writing those required fast, deterministic, state-aware feedback loops for agents. This is the first system of its kind for Metal, and I hope people will contribute to it. I think more effort should go into using Apple silicon efficiently.