The multi-step turn savings are what make this really add up, though. A single user message triggering 5-6 tool calls means 5-6 API calls, each re-sending the whole conversation so far, and everything before the latest tool result is a cache hit. That's where you actually get close to the 10x number.
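A rough back-of-the-envelope sketch of that dynamic, assuming cache reads are billed at ~0.1x base input and cache writes at ~1.25x (the multipliers Anthropic publishes; the token counts here are made up for illustration):

```python
# Rough cost model for one user turn that triggers n_steps tool calls.
# Each step re-sends the growing context; with caching, everything
# already seen is a cheap cache read and only the new tokens are written.

def turn_cost(prefix_tokens, step_tokens, n_steps, cached):
    """Input-token cost for one multi-step turn, in base-token units."""
    cost = 0.0
    context = prefix_tokens
    for _ in range(n_steps):
        if cached:
            cost += context * 0.1        # cache read of everything so far
            cost += step_tokens * 1.25   # cache write of the new tool result
        else:
            cost += context + step_tokens  # full price every step
        context += step_tokens
    return cost

# 50k-token prefix (system prompt + history), 1k tokens per tool step, 6 steps.
uncached = turn_cost(50_000, 1_000, 6, cached=False)
cached = turn_cost(50_000, 1_000, 6, cached=True)
print(round(uncached / cached, 1))  # → 8.2
```

The bigger the shared prefix relative to each step's new tokens, the closer the ratio creeps toward the 10x cache-read discount.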
One thing I'd add: this pairs well with routing simpler turns to cheaper models entirely. Caching saves you on input tokens, but if the turn is straightforward enough that Sonnet or gpt-4.1-mini can handle it, you save on both input and output. The two approaches are complementary.