Besides being even better at caching, I'm not sure what benefits you'd get compared to just firing up OpenCode with the DeepSeek API yourself. It'll do caching too, it also "talks directly to api.deepseek.com" if that matters, and you'll get a much more mature harness.
Can you share the bridge? DeepSeek v4 is awesome paired with claude-code or opencode. I found that Claude Code costs me less than opencode, and I presume this is due to a better-engineered harness.
I only used it for a few hours to play around with stuff before the quota issue was fixed and I could resume using GPT models. The bridge was coded by DeepSeek-V4-Flash-IQ2XXS + DwarfStar4 locally, so I take no responsibility for what might happen to your computer or you, during usage or just reading the code.
Edit: heh, like don't look at line 117 for example, where it seemingly likes to handle misspellings in the .env file, which totally wasn't my fault for typo'ing the API key in that file... I'm sure there are tons of sharp edges and dumb stuff in there.
Obviously, if you deal with any sort of secrets, then using local LLMs over OpenAI, Anthropic, DeepSeek or whoever is preferred, and in the case of users' personal data, probably a requirement.
Same with codex? codex-rs, at least, is a TUI as well. It does run an "app-server" in the background that the TUI actually interacts with, but that's just an implementation detail. It also makes it easy to hook in your own programs to fire off codex "headless" sessions even without the TUI.
From the FAQ, I see:
>Can I point it at a self-hosted / private DeepSeek endpoint?
>Yes. Since 0.30 we accept non-standard key prefixes for self-hosted DeepSeek endpoints. Just point `baseUrl` at your internal address — the loop, cache strategy, and tool protocol are unchanged.
But my question is: if I use Reasonix to talk to a DeepSeek endpoint through OpenRouter, am I still getting the cache-hit benefits of this agent harness?
I'm concerned since I really want SOTA reasoning, but DeepSeek still has me interested.
I have been using it for a while, and I wholeheartedly agree. IMO, it is as good as codex or claude, which I also use. It is a winner in the cost-sensitive tier, and if some startup could package it with data retention in mind, it could be a great product to sell to enterprises, since data retention and privacy are the main issues for the coding-assistant use case.
Yes, but a lot of harnesses change previous context. E.g. the system prompt injects the current time/date, working directory, files in the working directory, etc. Compaction also changes the whole previous context. I _think_ changing the list of tools also invalidates cache, so invoking a subagent with different tools would invalidate the cache.
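To make that concrete, here's a toy sketch, assuming the provider caches by exact prompt prefix (my understanding of how this kind of caching generally works, not something I've verified for every endpoint). Whitespace-split words stand in for tokens, and `build_prompt` and `shared_prefix_len` are made-up helpers, not anything from a real harness.

```python
from itertools import takewhile


def shared_prefix_len(prev: list[str], new: list[str]) -> int:
    """Count the leading tokens that two prompts have in common."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(prev, new)))


def build_prompt(system: str, history: list[str], volatile: str, volatile_first: bool) -> list[str]:
    """Assemble a prompt, placing the volatile part either first or last."""
    parts = [volatile, system, *history] if volatile_first else [system, *history, volatile]
    # Crude stand-in for tokenization: split on whitespace.
    return " ".join(parts).split()


system = "You are a coding agent. Use the tools provided."
turn1 = ["user: list the files", "tool: a.py b.py"]
turn2 = turn1 + ["user: open a.py", "tool: print('hi')"]

# Volatile data (a timestamp) injected at the very front of the prompt.
bad_prev = build_prompt(system, turn1, "time=12:00:01", volatile_first=True)
bad_new = build_prompt(system, turn2, "time=12:00:09", volatile_first=True)

# The same data appended after the stable system prompt and history.
good_prev = build_prompt(system, turn1, "time=12:00:01", volatile_first=False)
good_new = build_prompt(system, turn2, "time=12:00:09", volatile_first=False)

print(shared_prefix_len(bad_prev, bad_new))    # reusable prefix with the timestamp up front
print(shared_prefix_len(good_prev, good_new))  # reusable prefix with the timestamp at the end
```

With the timestamp up front, the two requests diverge at the first token, so nothing from the previous turn can be reused. With it at the end, the whole system prompt plus the first turn stays a shared prefix.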
My vague impression is that it's in a similar vein to functional programming languages. It generally disallows doing things that lead to bugs (cache misses in this case), and presumably allows you to do those things in a way that makes it much clearer that this is likely to cause cache misses. I would guess that in this paradigm, you don't mutate your existing session, you derive a new session by mutating the prior context into a new context.
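Roughly what I'm picturing, as a minimal sketch. I have no idea whether this matches the actual implementation; `Session`, `append`, and `derive` are names I made up for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Session:
    """Immutable conversation context (hypothetical, for illustration only)."""
    messages: tuple[str, ...] = ()
    cache_breaks: int = 0  # how many times the context was rewritten

    def append(self, message: str) -> "Session":
        """Cache-friendly: the old context stays an exact prefix of the new one."""
        return replace(self, messages=self.messages + (message,))

    def derive(self, rewrite: Callable[[tuple[str, ...]], tuple[str, ...]]) -> "Session":
        """Cache-breaking on purpose: build a new session from a rewritten context."""
        return Session(messages=tuple(rewrite(self.messages)),
                       cache_breaks=self.cache_breaks + 1)


s1 = Session().append("system: be helpful").append("user: hi")
s2 = s1.append("assistant: hello")

# Compaction is spelled differently from appending, so the cache miss is visible in the code.
s3 = s2.derive(lambda msgs: (msgs[0], "summary: greeted user"))

print(s2.messages[: len(s1.messages)] == s1.messages)  # appending preserved the prefix
print(s3.cache_breaks)                                  # the rewrite was counted
```

The point of the frozen dataclass is that there is no way to quietly edit `s1.messages` in place. Anything that changes earlier context has to go through `derive`, which is easy to grep for and easy to count.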
Is this really the behavior you want? Yes, doing tool-result clearing and such will blow your cache, but if you do it only occasionally, it's still likely a win. Yes, cache hits are good, but not so good that it's okay to be profligate with context to preserve those precious, precious KVs.
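A back-of-the-envelope sketch of that trade-off. Every number here is hypothetical: prices are in arbitrary units with cache misses costing 10x cache hits, each turn adds 2000 tokens, and `run_cost` is a made-up helper. It only counts input tokens and ignores the cost of producing a summary and any effect on answer quality.

```python
from typing import Optional


def run_cost(turns: int, growth: int, hit_price: float, miss_price: float,
             clear_at: Optional[int] = None, keep: int = 0) -> float:
    """Total input cost of a session; prices are per token (hypothetical units)."""
    context = 0  # tokens already in the conversation before this turn
    cost = 0.0
    for turn in range(1, turns + 1):
        if clear_at is not None and turn == clear_at:
            # History is rewritten: whatever we keep is re-read at the miss price.
            context = min(context, keep)
            cached = 0
        else:
            # Untouched history is an exact prefix, so it is billed at the hit price.
            cached = context
        cost += cached * hit_price + (context - cached + growth) * miss_price
        context += growth
    return cost


never_clear = run_cost(turns=40, growth=2000, hit_price=1, miss_price=10)
clear_once = run_cost(turns=40, growth=2000, hit_price=1, miss_price=10,
                      clear_at=20, keep=4000)

print(never_clear)  # 2360000.0
print(clear_once)   # 1682000.0
```

Under these made-up numbers, one cache-busting cleanup halfway through comes out cheaper than carrying the full context at the hit price for the rest of the session. Change the hit/miss ratio or clear every few turns and the answer can flip, which is sort of the point.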
This is still art as much as science and the different harnesses take different approaches.