A few highlights:

- Speculative Actions: parallel API execution, ~30% speedup
- KVComm: share KV pairs instead of text, 30% of layers gets near-full performance
- DoVer: intervention-driven debugging that flips 28% of failures to successes

Happy to discuss any of the papers or the framing. The decision matrix at the end maps each problem to a starting paper.