But to the actual question: a lot of people's gut instinct on how to solve this doesn't work. They go down the road of "well, if I teach the AI about my legacy codebase, it will be smarter, and therefore more efficient." But all you end up doing is filling your available context with irrelevancies, and your agent gets dumber and costs more.
What you actually need to do is tackle it the same way a human would: break it down into smaller problems where the agent can keep the entire problem in context at once, meaning 256K tokens or less (file contents + prompt + outputs). Then use a scratchpad file that holds notes, file references, constraints, and line numbers; that's your compaction protection. Restart the chat with the same scratchpad when you move between minor areas.
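To sanity-check whether a sub-problem actually fits that budget, a crude size estimate is usually enough. Here's a minimal sketch; the 4-characters-per-token ratio and the 8K prompt overhead are rough assumptions, not real tokenizer figures, so treat the numbers as ballpark only:

```python
# Rough check that a sub-problem fits in the agent's context window.
# ASSUMPTIONS: ~4 chars per token and ~8K tokens of prompt/output overhead.
# Real tokenizers and real prompts vary; this is a ballpark filter.
from pathlib import Path

CONTEXT_BUDGET_TOKENS = 256_000
CHARS_PER_TOKEN = 4  # crude heuristic, not an exact tokenizer ratio

def estimated_tokens(paths, prompt_overhead_tokens=8_000):
    """Estimate total tokens for a set of files plus prompt/output overhead."""
    total_chars = sum(Path(p).stat().st_size for p in paths)
    return total_chars // CHARS_PER_TOKEN + prompt_overhead_tokens

def fits_in_context(paths):
    """True if these files (plus overhead) should fit in one session."""
    return estimated_tokens(paths) <= CONTEXT_BUDGET_TOKENS
```

If `fits_in_context` comes back false, that's your cue to split the problem further before starting the session.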
Context is your primary limited resource. Fill it only with what absolutely needs to be there, and nothing else.
Usually it's an iterative process; if done correctly you could end up with a much better codebase. Good luck!
In my company we tried using Claude for exactly the task you describe. The results were bad. We discovered a few interesting things, but most of it was wrong: we had to dig through the codebase the old way to confidently accept or reject what Claude was telling us. We could have saved a lot of time and money by simply doing it ourselves. As an upside, we also learned the codebase, so now people rely on us for that (which feels good too).
Then for execution: use plan mode. Always have it write a plan first; check it, correct it, and only then allow it to implement.
Break big tasks down into small substeps, as small as possible. Let it implement changes iteratively and make plenty of local git commits; both Codex and Claude Code use the commit history as documentation as well.
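The substep loop can be sketched like this. The `apply_substep` callback and the step descriptions are hypothetical placeholders for whatever the agent actually does at each step; the point is just "one small change, one commit":

```python
# Sketch of the iterative loop: after each small change, commit locally so
# the history doubles as documentation the agent can re-read later.
# apply_substep is a hypothetical callback standing in for the agent's work.
import subprocess

def commit_substep(message: str) -> None:
    """Stage everything and record one small, well-described commit."""
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)

def run_substeps(substeps, apply_substep) -> None:
    """Apply each small step, then commit it immediately."""
    for step in substeps:
        apply_substep(step)  # hypothetical: the agent implements this step
        commit_substep(f"refactor: {step}")
```

Keeping each commit to one substep also means a bad step is a one-line `git revert` instead of an archaeology session.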
Basically, treat it like a junior developer working under you.
Make sure appropriate tests are written for every code change.
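For legacy code specifically, the cheapest useful tests are characterization tests: pin down what the code does *now* before the agent touches it. A minimal sketch, where `legacy_rounding` is a hypothetical stand-in for whatever function you're about to change:

```python
# Characterization test sketch: capture current behavior of legacy code
# before refactoring, so the agent can't silently change it.
# legacy_rounding is a hypothetical stand-in for your real legacy function.

def legacy_rounding(amount_cents: int) -> int:
    """Stand-in legacy behavior: round down to the nearest 10 cents."""
    return amount_cents - (amount_cents % 10)

def test_legacy_rounding_is_pinned():
    # Expected values were captured from the code's current output,
    # not from a spec -- that's what makes this a characterization test.
    assert legacy_rounding(107) == 100
    assert legacy_rounding(110) == 110
    assert legacy_rounding(0) == 0
```

Write these first, let the agent refactor, and any red test tells you exactly which behavior it changed.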