Breakthrough result (March 01, 2026)
On standard commodity CPUs with ROLV, full Kimi K2.5 serving achieves:
Baseline without ROLV: 0.10 req/s • 74.39 output tok/s • 1,380.71 total tok/s • 1,039.99 s wall time
ROLV accelerated: 4.37 req/s • 3,253.47 output tok/s • 60,385.79 total tok/s • 23.78 s wall time • 206 ms mean TTFT
Kernel acceleration: 43.7× faster than dense baseline

IMPROVEMENTS WITH ROLV
Requests/sec increase: 43.7× (+4,273.5%)
Output tokens/sec increase: 43.7× (+4,273.5%)
Total tokens/sec increase: 43.7× (+4,273.5%)
Wall time reduction: 43.7× (97.7% faster)
TTFT mean reduction: 43.7× (97.7% faster)
TTFT median reduction: 43.7× (97.7% faster)
End-to-end latency reduction: 43.7× (97.7% faster)
Per-request TPS mean increase: 43.7× (+4,273.5%)

KERNEL ENERGY MEASUREMENTS (for 200 iterations)
Dense baseline: 18,992.76 Joules | ROLV accelerated: 339.77 Joules | Energy saved: 98.2%
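
The ratios and percentages above can be recomputed from the raw figures. The following is a minimal Python sketch of that arithmetic, not the tooling used to produce the benchmark; the function names and dictionary keys are illustrative assumptions. Because the published req/s values are rounded to two decimals, that row recomputes to about +4,270% rather than +4,273.5%, while the token and wall-time rows reproduce the stated numbers.

```python
# Figures copied from the benchmark summary; all names here are illustrative.
BASELINE = {"req_per_s": 0.10, "output_tok_per_s": 74.39,
            "total_tok_per_s": 1380.71, "wall_time_s": 1039.99}
ROLV = {"req_per_s": 4.37, "output_tok_per_s": 3253.47,
        "total_tok_per_s": 60385.79, "wall_time_s": 23.78}

# Kernel energy over 200 iterations, in joules.
ENERGY_DENSE_J = 18992.76
ENERGY_ROLV_J = 339.77


def speedup(before: float, after: float, lower_is_better: bool = False) -> float:
    """Ratio by which `after` improves on `before`."""
    return before / after if lower_is_better else after / before


def percent_increase(before: float, after: float) -> float:
    """Relative gain for higher-is-better metrics (throughput)."""
    return (after - before) / before * 100.0


def percent_reduction(before: float, after: float) -> float:
    """Relative saving for lower-is-better metrics (time, energy)."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Throughput metrics: higher is better.
    for key in ("req_per_s", "output_tok_per_s", "total_tok_per_s"):
        b, r = BASELINE[key], ROLV[key]
        print(f"{key}: {speedup(b, r):.1f}x (+{percent_increase(b, r):,.1f}%)")

    # Wall time: lower is better.
    b, r = BASELINE["wall_time_s"], ROLV["wall_time_s"]
    print(f"wall_time_s: {speedup(b, r, lower_is_better=True):.1f}x "
          f"({percent_reduction(b, r):.1f}% lower)")

    # Energy: lower is better.
    print(f"energy: {percent_reduction(ENERGY_DENSE_J, ENERGY_ROLV_J):.1f}% saved")
```

Separating `percent_increase` from `percent_reduction` keeps the two conventions explicit: a 43.7× throughput gain is a gain of roughly 4,270%, while a 43.7× time reduction is a saving of about 97.7%, since a reduction can never exceed 100%.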
Result: Commodity CPUs with ROLV now beat a single NVIDIA B200 GPU without ROLV by a massive margin, while using far less power and no specialized hardware.