Hello, author here, excited to discuss this with everyone. The context: more agents (A) are running for longer turns (T) at greater context lengths (C). Cost scales as A * T * C, so when all three grow together it is roughly a cubic problem, and one that current SOTA approaches leave unaddressed. "fak" (fused agent kernel) addresses this by turning O(n^3) into O(n). So far this has been shown on small to mid-sized tasks, at ~4x SOTA. Unlike most other approaches, there is no real downside and no extra effort required to get the benefit. While it's still early in terms of proving it out at large scale and on hyperscaler workloads, these early results show promise. Thank you.