M5 Max LLM Benchmarks Against M3 Ultra (creativestrategies.com)

Ran llama-bench on my M3 Pro with `--n-depth 0,8192,16384 --n-prompt 2048 --n-gen 256 --batch-size 2048 -ub 2048`:

  | model                           |       size |     params | backend    | threads | n_ubatch |            test |                  t/s |
  | ------------------------------- | ---------: | ---------: | ---------- | ------: | -------: | --------------: | -------------------: |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 |          pp2048 |        512.97 ± 0.33 |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 |           tg256 |         25.92 ± 0.23 |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 |  pp2048 @ d8192 |        397.20 ± 2.32 |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 |   tg256 @ d8192 |         22.56 ± 0.36 |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 | pp2048 @ d16384 |        313.67 ± 0.63 |
  | qwen35moe 35B.A3B Q4_K - Medium |  19.74 GiB |    34.66 B | MTL,BLAS   |       6 |     2048 |  tg256 @ d16384 |         20.45 ± 0.04 |
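
To put the depth scaling in perspective, here is a quick sketch that works out how much throughput drops relative to the empty-context run. The numbers are copied by hand from the table above; the `results` dict and the `slowdown_pct` and `summarize` helpers are just names I made up for this, not anything llama-bench provides.

```python
# Throughput (tokens/s) copied from the llama-bench table above,
# keyed by test name and then by context depth (d0, d8192, d16384).
results = {
    "pp2048": {0: 512.97, 8192: 397.20, 16384: 313.67},
    "tg256": {0: 25.92, 8192: 22.56, 16384: 20.45},
}


def slowdown_pct(baseline, value):
    """Percentage by which `value` falls below `baseline`."""
    return (1 - value / baseline) * 100


def summarize(results):
    """Return (test, depth, t/s, % drop vs depth 0) rows, ordered by depth."""
    rows = []
    for test, by_depth in results.items():
        base = by_depth[0]  # depth 0 is the reference point
        for depth, tps in sorted(by_depth.items()):
            rows.append((test, depth, tps, round(slowdown_pct(base, tps), 1)))
    return rows


for test, depth, tps, drop in summarize(results):
    print(f"{test:>7} @ d{depth:<6} {tps:8.2f} t/s  ({drop:.1f}% below depth 0)")
```

By this arithmetic, prompt processing loses roughly 39% of its speed at a depth of 16384, while generation loses roughly 21%.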

I sure do want that silicon now haha.