This way we can save Anthropic and co. for example the effort of recalculating all the linear algebra needed for the prefill of the system prompt, for which they reward us with reduced input token cost. The result is the same, if cached or not cached, it's just less computation.
For the same prompt you would still get the same answer (assuming temperature = 0).
The big savings you will get in an agentic/conversational context: Every new turn always puts the full message array back into the request. If we don't cache the calculation result at every step, the provider has to recalculate the early turns potentially hundreds of times (see second/third graphic).