282 points by alattaran 7 hours ago | 31 comments
  • aftbit 6 hours ago

        #!/bin/sh
        export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
        export ANTHROPIC_AUTH_TOKEN=sk-secret
        export ANTHROPIC_MODEL=deepseek-v4-flash
        export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
        exec claude "$@"
    • rapind 3 hours ago
      ANTHROPIC_MODEL=deepseek-v4-pro[1m] ANTHROPIC_SUBAGENT_MODEL=deepseek-v4-flash

      This is what I’ve been using for non-confidential projects for about a week now (soon after v4 came out). I honestly can’t tell the difference, but I’m not doing anything crazy with it either.

      Worth noting that I don’t think DeepSeek’s API lets you opt out of training. Once this is up on other providers though… (OpenRouter is just proxying to DeepSeek atm)

    • varenc an hour ago
      The more interesting part of deepclaude is the local proxy it runs to switch models mid-session and do combined cost tracking. Though these features seem quite buried in the LLM-generated readme. Looking at the history, it appears they were added later, and the readme wasn't restructured to highlight this.

      Also, the author checked in their apparently effective social media advertising plan: https://github.com/aattaran/deepclaude/commit/a90a399682defc... (which seems to be working)

      • yard2010 43 minutes ago
        How come such slop is allowed here? What value do these vibe-coded, zero-shot "projects" add? Why not just post the prompt?
        • fragmede 8 minutes ago
          Convenience? Am I supposed to take the prompt and use my own tokens on it? Why should I have to do that?
        • otabdeveloper4 35 minutes ago
          Recruiters used to use the candidate's Github "sources" page for evaluating candidates as a kind of proof-of-work.
          • groestl 10 minutes ago
            And recruiter agents still do.
    • aaurelions 6 hours ago
      It seems like any project that makes fun of Claude is bound to reach the top spot on Hacker News. Even if it’s just a project consisting of four lines of code.
      • ihsw 5 hours ago
        [dead]
    • spirit23 2 hours ago
      So I created https://getaivo.dev; you can use any model in the coding agent directly. Just `aivo claude -m deepseek-v4-pro`
    • btbuildem 4 hours ago
      This in essence is what allows one to use any model with CC -- including local.
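      A minimal sketch of the same trick pointed at a local server. The endpoint, token, and model name below are placeholders, not real values from the repo; substitute whatever local Anthropic-compatible server or proxy you actually run:

      ```shell
      #!/bin/sh
      # Hypothetical launcher: point Claude Code at a local
      # Anthropic-compatible endpoint instead of api.anthropic.com.
      export ANTHROPIC_BASE_URL="http://localhost:8080"   # your local server/proxy
      export ANTHROPIC_AUTH_TOKEN="local-key"             # many local servers ignore this
      export ANTHROPIC_MODEL="my-local-model"             # whatever the server serves
      export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
      exec claude "$@"
      ```

      The only moving parts are the env vars; Claude Code itself is unchanged.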
    • nadermx 6 hours ago
      The AI wars have begun
      • stingraycharles 3 hours ago
        This has been possible since the beginning.
    • faangguyindia 2 hours ago
      Those who use DeepSeek v4: what level of output do you get? Codex 5.3 or GPT 5.4?

      Is the Flash version on the level of GPT 5.4 mini?

  • vitaflo 6 hours ago
    I'm not exactly sure what the point of this is. DeepSeek already has instructions for using its API with many CLIs, including Claude Code directly:

    https://api-docs.deepseek.com/quick_start/agent_integrations...

    • varenc an hour ago
      The readme absolutely buries the features that are actually non-trivial: It runs a proxy to switch models mid-session, and does combined cost tracking between Anthropic and other models you might be using. The LLM that wrote the readme never updated the general project description to highlight these features.

      Also the author checked in their advertising plan: https://github.com/aattaran/deepclaude/commit/a90a399682defc...

    • 2ndorderthought 6 hours ago
      There probably isn't a point. Someone didn't understand something, didn't research it, so they one-shotted their first thought and sent it to the front page of HN and all of their socials. It's the future bruh
      • georgeburdell an hour ago
        I embrace it at this point. It ends all the shilling of vibe-coded tools at work that I have endured over the past year. Everyone can now make their own tools with zero obligation to coordinate beyond shared hardware resources
      • altmanaltman an hour ago
        To be fair, HN sent it to the front page, not the user. With the rest I agree.
    • croes 5 hours ago
      From vibe coders for vibe coders
      • 2ndorderthought 4 hours ago
        I don't always copy paste vibe coded project readme mds into Claude code and ask them to rewrite it but when I do... actually that's all I do now because my goal in life is to make wealthy overvalued companies wealthier.
      • kordlessagain 3 hours ago
        Problem?
    • ttoinou 6 hours ago
      I thought the tool format wasn't exactly the same? So plugging any AI into Claude Code requires a format conversion
    • crooked-v 5 hours ago
      I'm curious how well it actually works. I tried DeepSeek with Hermes and OpenCode and it seemed extremely bad at using some of the basic tools given, like the Hermes holographic memory tools, even with system prompt instructions strongly pointing them out.
  • justech 6 hours ago
    If you're looking for Claude Code alternatives, I would first suggest looking into pi.dev or opencode for your harness. Then for models, you can choose from OpenCode Go (IMO the most cost-effective at the moment), OpenRouter, or direct from DeepSeek. Better if you go the Kimi route IMO and just buy a subscription from kimi.com
    • wolttam 5 hours ago
      I’m going to throw my harness in the ring: https://codeberg.org/mlow/lmcli
      • taocoyote 2 minutes ago
        Looks interesting. Does it offer anything special that pi.dev or opencode does not?
    • Aeroi 5 hours ago
      Agreed. OpenCode is a strong base, and with a couple of modifications it can become a very effective harness. For my side project mouse.dev, I’ve been combining parts from OpenCode, Claude Code, and Hermes to build a cloud agent architecture that works well from mobile.
      • adobrawy 21 minutes ago
        I'm a Claude Code Web fan and a rather heavy user, so I was interested in your product. However, I couldn't find an answer on the website: which parts did you find so good that you ported them?
      • CharlesW 5 hours ago
        > OpenCode is a strong base, and with a couple modifications it can become a very effective harness.

        I personally didn't find it to be competitive with Claude Code as a harness. Can I ask how you modified it to perform better?

        • Aeroi 5 hours ago
          I haven’t run formal evals, but I improved the experience for my own needs, and it feels noticeably better with these modifications:

          - Claude-style subagents
          - an MCP layer for higher-level tools
          - Cursor-style control plane modes like Ask, Plan, Debug, and Build

          The MCP layer lets the harness use things like GitHub file/code read, PR creation, web search/fetch, structured user questions, plan-mode switching, user skills, and subagents.

          So the improvement is mostly from better UI/UX orchestration and tool access. There are some things from Hermes that are interesting as well.

          Most of my focus has been on applying this stack to sandboxed cloud agents so you can properly code and work from mobile devices.

          I can't definitively say that the stack is better or worse than Claude Code; it's more just tuned for my use case, I guess.

    • bakugo 5 hours ago
      > I would first suggest looking into pi.dev

      Looked into this one. Thought it was suspicious that it only had 7 open issues on github. Turns out they have a bot that auto-closes every single issue just because.

      I honestly have no words.

      • mikeocool 2 hours ago
        Their process is outlined here: https://github.com/badlogic/pi-mono/blob/main/CONTRIBUTING.m...

        > Maintainers review auto-closed issues daily and reopen worthwhile ones. Issues that do not meet the quality bar below will not be reopened or receive a reply.

        Seems like not an unreasonable way to deal with the problem of large numbers of low quality issues being submitted.

        • cromka an hour ago
          Sounds like a perfect way to agitate the community, going against the established culture like that.
        • altmanaltman an hour ago
          But how is it any different from keeping them open?

          Like, if they are going to sort through all the issues eventually (as they claim), why not just close the ones that are not worthy when they get to them, instead of closing all by default?

          Is it just so that the project doesn't have open issues on its GitHub page? But they are open issues in reality, because the maintainer will eventually go through them.

          Nothing is "unreasonable" in the sense that an open source project should have the right to do what it wants with its rules, but it's definitely a weird stance.

      • __cayenne__ 2 hours ago
        The maintainer, Mario, sometimes declares that the repo is on an “issue holiday” where issues are auto-closed. This particular holiday is because there is a big refactor coming up. In non-holiday periods, issues can be reported as normal.
      • skeledrew 2 hours ago
        • DetroitThrow 10 minutes ago
          "Decent" is doing some work there. This goes beyond any norms I've encountered in OSS: closing issues by default via an LLM or an "issue holiday".
      • justinhj 2 hours ago
        It's a very interesting project. Many popular open source projects are inundated with poor-quality issues and PRs, hence the defences they are starting to erect.
      • LPisGood 4 hours ago
        The idea is for it to be extremely minimal, which strikes me as a very opinionated stance, and not one I agree with.
    • aaurelions 6 hours ago
      Another very cost-effective option is Ollama Cloud. In a month of use, I only hit the 5-hour limit once, when I ran 8 agents simultaneously for 2 hours.
      • kopirgan 2 hours ago
        On which tier?
      • postatic 5 hours ago
        Definitely worth it. I have Ollama Cloud, OpenCode, and Hermes all running to test them out; working great so far.
  • _345 6 hours ago
    If you're okay with Sonnet-level performance, this sounds like a straight upgrade. But I find that Sonnet messes up too much for it to be worth cost-optimizing down to it or another Sonnet-level model. Glad to have this as an option though
    • Culonavirus 17 minutes ago
      We're not yet at a saturation point where all the frontier models are of comparable "intelligence" and we could pick based on other factors (speed, effective context window, etc.), so I honestly don't see why you (as a company or an employee) would not use the best available model with the highest (or at least second-highest) thinking effort. The fees are not exactly cheap, but not that expensive either.
    • 2ndorderthought 6 hours ago
      A lot of people are having good experiences doing things like using opus for designing and using locally hosted qwen3.6 for implementation.

      I could see a serious cost reduction story by using opus for design and deepseek for implementation.

      Personally I would avoid anthropic entirely. But I get why people don't.

      • girvo 6 hours ago
        Like me: that’s what I do. Either Opus 4.7 or GLM 5.1 for planning, write it out to a markdown file, then farm it out to Qwen 3.6 27B on my DGX Spark-alike using Pi. Works amusingly well all things considered.
        • brianjking 3 hours ago
          How are you interacting with GLM 5.1? Via the Claude Code harness? I really wish they'd release a fully multimodal model already.
        • 2ndorderthought 6 hours ago
          How is GLM 5.1? I haven't tried it yet but have been meaning to
          • girvo 5 hours ago
            It's surprisingly good. It beats MiniMax 2.7 and Qwen 3.5 Plus in my testing (I haven't tested 3.6 Plus though), quite handily. It's far better than Sonnet, and often equivalent to Opus for the web development and OCaml tasks I'm using it for. It definitely isn't Opus 4.7, but it's far good enough to earn its keep and is substantially cheaper.
            • sshine 4 hours ago
              I agree with this. And also: it uses more thinking time to reach this level. So while you get a lot of tokens on their plan, the peak 3x token-usage multiplier plus the extra thinking means you run into the rate limit anyway.
              • girvo 4 hours ago
                True, though with the $20-equivalent plan used only for planning I don’t hit those limits often, vs Claude where Pro can literally hit limits with a single prompt haha
        • aftbit 6 hours ago
          What hardware are you using to power this?
          • girvo 5 hours ago
            > DGX Spark-alike

            Probably wasn't clear enough if you don't know what that is already, apologies

            It's an Asus Ascent GX10, which is a little mini PC with 128GB of LPDDR5X as shared memory for an Nvidia GB10 "Blackwell" (kind of, it's a long story) GPU and a MediaTek ARM CPU

            • sterlind 3 hours ago
              pulls up chair

              could you tell me the long story?

              edit: or wait, is it quasi-Blackwell the way all DGX Sparks are quasi-Blackwell? like the actual silicon is different but it's sorta Blackwell-shaped?

              • girvo 2 hours ago
                Yeah exactly. Shader model 121 is different to SM 120 (consumer Blackwell) and is different again to data centre Blackwell SM100.

                The promise of this chip was “write your code locally, then deploy to the same architecture in the data centre!”

                Which is nonsense, because the GB10 is better described as “Hopper with Blackwell characteristics” IMO.

                Still great hardware, especially for the price and learning. But we are only just starting to get the kernels written to take advantage of it, and mma.sync is sad compared to tcgen05

            • aftbit 5 hours ago
              Ah yeah I saw that, I was just curious which particular mini-PC you were using. I was considering picking up one of the various AI Max 395 boxes before the RAMpocalypse but didn't take the plunge. Thanks for the response!
              • girvo 4 hours ago
                I heavily considered one of the AMD Strix Halo boxes, but part of the reason I wanted this was to learn CUDA :)
    • maxdo 3 hours ago
      This is the problem: you need the best model, not just a good one, for:

      - Good architecture, which requires reading specs, code, etc. (reads like: lots of tokens in/out)
      - Bug fixing: same, plus logs, e.g. Datadog

      Once you've found the path, patches are trivial and the savings are tiny unless you're doing refactoring/cleanup.

      Testing gets more and more complicated. Take a look at OpenCode Go, and you see this:

      > Includes GLM-5.1, GLM-5, Kimi K2.5, Kimi K2.6, MiMo-V2-Pro, MiMo-V2-Omni, MiMo-V2.5-Pro, MiMo-V2.5, Qwen3.5 Plus, Qwen3.6 Plus, MiniMax M2.5, MiniMax M2.7, DeepSeek V4 Pro, and DeepSeek V4 Flash

      And now you're on your own with the bugs all of these models can produce at scale. Am I missing anything in this picture? What is the real use of cheaper models?

    • chrsw 6 hours ago
      I keep re-learning this lesson: I chug along with a lesser model, then throw a problem at it that's too complex. Then I try different models until I give up and bring in Opus 4.6 to clean up.
      • brianwawok 5 hours ago
        And I keep using Opus to, like, make git commits. I really just need a smart router that is actually smart, vs having to micromanage the model choice
        • sterlind 3 hours ago
          the problem is managing the contexts. your session might fit in Opus, but will that smaller model you dispatch the git commit to fit? even so, will it eat too much on prefill? do you keep compactions around for this, or RAG before dispatch or something? how do you button back up the response?

          all doable but all vaguely squishy and nuanced problems operationally. kinda like harness design in general.

      • energy123 2 hours ago
        It's not even that much cheaper: GPT 5.5 is only about 2x more expensive per task than DeepSeek v4 Pro when you adjust for lower token usage, according to Artificial Analysis. Doesn't seem worth it to me.
      • willio58 5 hours ago
        I don’t find this with Sonnet at all. As long as I have a solid CLAUDE.md, periodically review the output, and enforce good code practices via basic CI gates, I’ve rarely found myself having to switch to Opus
        • 2ndorderthought 4 hours ago
          You might be surprised then at how well cheaper models solve your problems
  • sowild_fun an hour ago
    Having used a bunch of CLIs with DeepSeek V4, I've found that Langcli is the best fit for it. For programming tasks, the cache hit rate is above 95%.

    Not only can it seamlessly and dynamically switch between DeepSeek V4 Flash, V4 Pro, and other mainstream models within the same context, but it is also 100% compatible with Claude Code.

    • sfewfweg an hour ago
      Langcli + DeepSeek v4 is very good
  • nclin_ 3 hours ago
    Is Claude Code the best coding harness? Anyone running evals on that?
    • ahmadyan 3 hours ago
      In my anecdotal experience, it is not. The same model, Opus, works better in 3P harnesses such as Factory Droid or Amp.

      Claude Code, on the other hand, is the most subsidized one, both for consumers (through the Max subscription) and for enterprises (token discounts). It is also heavily optimized for cost, especially token caching and reduced thinking, at the expense of quality.

  • 999900000999 2 hours ago
    I just spent half my day getting CUDA and LLAMA to work with my 5070 Ti.

    I was able to use it in agent mode with Roo, I stopped after having it write out a plan, but I'll continue when I have more time.

    Deepseek feels less likely to do a straight up rug pull since you can self host with enough money, but I'm still more excited about local solutions.

    Usually I just need grunt work done. I'm not solving difficult problems.

  • dopeepsreaddocs 4 hours ago
    Did... did you just ask an AI to one-shot something that normally amounts to no more than setting two env variables?
  • langitbiru 2 hours ago
    I'm wondering why DeepSeek didn't create an AI coding agent like Kimi Code.
  • alexdns 6 hours ago
    Obviously vibe-coded (co-authored) + the prices don't even match
    • 2ndorderthought 6 hours ago
      It's going to be real hard to find headlines that weren't vibe coded from here on out unfortunately.
      • SchemaLoad 6 hours ago
        Unless I actually know the author I assume everything here is vibeslop and full of mistakes.

        Maybe I need to switch to some news publication that actually does real research and writing still. Because public forums like this have been completely destroyed by LLMs.

      • cyanydeez 6 hours ago
        Welp, pack it in boys, it was nice conceptualizing all of you as real humans on the internet. I guess I'll just have to go touch grass if I want to feel parasocial.
        • dragontamer 6 hours ago
          I mean, we have the tech and community to actually build in person meetups and sign CRT certificates, right?

          If we touch grass in person and swap certificate requests, we can actually rebuild a trust network.

          This is a pretty old problem with regards to clubs / secret societies and whatnot. And with certificates / PKI, our modern security tools have solved all the technical problems.

          • 2ndorderthought 6 hours ago
            I wish I could be invited to a secret club of guaranteed humans. Someone hand me a certificate next time you see me! Also don't stab me kthxbye
            • cyanydeez 5 hours ago
              Unfortunately, a lot of what's happening in the tech world seems to be from some super-serious AI cults, so I'm not sure going offline like this is any better.
              • 2ndorderthought 5 hours ago
                Yea but we could have fun. Play some DnD. Drink tea or whiskey. Eat pizza pie. Light saber battle. Buy a megaphone and hang out at a street corner telling passersby they are perfectly acceptable and worthy of kindness and love
    • inciampati 5 hours ago
      Poorly vibe-coded. Machines can check details easily; use them.
  • dbeley 2 hours ago
    Honestly, with the likes of OpenCode / pi / Hermes, I don't really find the "Claude Code agent loop" part particularly interesting.

    The edge Anthropic has over others lies in its models' performance. The CLI tooling (and obviously pricing) is definitely not better than others'.

    • danny_codes 2 hours ago
      Except the model isn't particularly better anymore, compared to the newest wave of FOSS models
  • vagab0nd 5 hours ago
    This has become a problem for me. I like trying new things. But I also know that in about a week, there's going to be a better/cheaper setup. And a week after that. And ideally I'd like to get some coding done when I'm not tinkering with the tools.

    So I think I'll stay with CC for now.

    • kordlessagain 3 hours ago
      CC can use Ollama as well, including having Ollama proxy to Ollama's cloud models. It's brilliant, and it works with a single Ollama command that doesn't mess with CC at all (so you can run them at the same time).

      If you are interested, I've built an agentic terminal that helps manage these types of things better: https://deepbluedynamics.com/hyperia

  • orliesaurus 6 hours ago
    Is there a way to do this directly by using the Claude Code CLI (which I already have installed) and OpenRouter?
    • theanonymousone 6 hours ago
      Yes, from Claude Code themselves: https://code.claude.com/docs/en/llm-gateway
    • jubilanti 5 hours ago
      Here's a one-liner:

         ANTHROPIC_BASE_URL="https://openrouter.ai/api" ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY" ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek/deepseek-v4-flash" CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 claude
    • gnat 6 hours ago
      This repo's README explains how it works, and you can do it yourself: claude looks for environment variables that say which API endpoint to talk to, which key to pass, which model name to use for haiku/sonnet/opus-level workloads, etc.
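      The mechanism described above can be sketched like this. `ANTHROPIC_DEFAULT_SONNET_MODEL` appears elsewhere in this thread; I'm assuming the opus/haiku analogues follow the same naming pattern, so check the repo's README before relying on them:

      ```shell
      #!/bin/sh
      # Sketch: map each Claude tier to a substitute model via env vars.
      export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
      export ANTHROPIC_AUTH_TOKEN="$DEEPSEEK_API_KEY"
      export ANTHROPIC_DEFAULT_OPUS_MODEL="deepseek-v4-pro"    # opus-level workloads
      export ANTHROPIC_DEFAULT_SONNET_MODEL="deepseek-v4-pro"  # sonnet-level workloads
      export ANTHROPIC_DEFAULT_HAIKU_MODEL="deepseek-v4-flash" # cheap/background tasks
      exec claude "$@"
      ```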
  • Lihh27 5 hours ago
    The wrapper is basically env-var glue. You’re still betting the whole loop on Anthropic's closed client.
  • game_the0ry 5 hours ago
    Cost engineering [1] will be the next hot topic for AI.

    [1] A fancier way of saying "reducing cost."

  • triyambakam an hour ago
    And if I don't care about cost, what about actual performance?
  • dukeofdoom 2 hours ago
    Is there some way to make Claude/Codex beep when it finishes a task?
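    One portable, harness-independent trick is a tiny wrapper that rings the terminal bell when whatever command you hand it exits. `ring_when_done` is a hypothetical helper name, not a Claude or Codex feature:

    ```shell
    # ring_when_done: run any command, then emit BEL so the terminal
    # beeps (or flashes). The wrapped command's exit status is preserved.
    ring_when_done() {
      "$@"
      status=$?
      printf '\a' >&2   # bell to stderr, so captured stdout stays clean
      return $status
    }
    ```

    Usage would be e.g. `ring_when_done claude -p "run the test suite"`. Claude Code also has a hooks system that can run a command when the agent stops, which is worth checking in its docs.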
  • portsentinel 2 hours ago
    I am now wondering how far agentic AI can go and how much we can achieve
  • fHr 4 hours ago
    Layer on layer on layer to refactor a bunch of lines xD
  • esafak 6 hours ago
    Why wouldn't you use something open source like OpenCode, which already supports DSv4 and has more features than CC?
    • CharlesW 5 hours ago
      Coding harnesses make a big difference, and OpenCode is notably less effective than Claude Code (1) in my experience, (2) with the models I've tried it on. (I've not yet tried it with DSv4.)
    • dlx 6 hours ago
      As someone who does use other models with CC, I am curious about OpenCode: what extra features does it have that you find essential?
      • esafak 6 hours ago
        I like being able to add a wide array of models, define perms for agents and subagents, turn MCPs on and off at will, and be able to fix bugs I find in it.
        • dlx 5 hours ago
          Fair enough... any drawbacks that you've found?
          • esafak 5 hours ago
            Its UI isn't as slick, and it has bugs, but so does CC and you can submit a PR to have them fixed in OC.
    • ttoinou 6 hours ago
      More features than CC?

      Also, opencode tracks you by default. It's not safe: every first prompt you send is routed through their servers and logged, and they can use your data however they want

      • sedawkgrep 6 hours ago
        I thought this was debunked a while ago?
      • esafak 6 hours ago
        I could not find any evidence of prompt logging. The code is open; can you point me to it?
  • 2ndorderthought 6 hours ago
    Oh shoot, now the next CC upgrade will blow your subscription for doing this
  • morpheos137 6 hours ago
    Anthropic messed up big time: the harness works with any muh commodity LLM. Meanwhile, VCs were duped by the myth of FOOM AGI. Probably not a coincidence that Anthropic is enmeshed with the sci-fi fan fic forum known as LessWrong. The world wants useful tools; the Bay Area bubble, in contrast, thrives on Mythos.
    • hgyyy 5 hours ago
      I think OAI and Anthropic will be OK for a year or two. But after that, if they still earn their revenue from selling tokens to firms/software engineers, they will be in serious trouble.

      The American firms are not demonstrating escape velocity, and as long as China offers something somewhat comparable at a very low price to compensate for any difference in quality, they will not generate enough cash flow to finance reinvestment. I highly doubt they’ll be able to keep raising external financing for many more periods from here on out; they have to start showing strong financials and that they are running away from the open source models.

      • LeFantome 4 hours ago
        The performance gap will likely close as Chinese hardware improves. This is happening very rapidly.

        Already, DeepSeek v4 is being hosted on Huawei Ascend 950. What do you think those cost relative to NVIDIA gear?

      • morpheos137 4 hours ago
        I wouldn't put it past the US gov to ban foreign models. They tried to ban TikTok. What is being demonstrated here is that Silicon Valley cannot withstand a competitive market.
        • LeFantome 4 hours ago
          Good luck banning Open Source models.

          Not only that but other countries are very unlikely to follow suit, so it is just a straight-up productivity tax on the US.

          • morpheos137 3 hours ago
            Yeah, see the Nvidia/China US-gov self-own. The assumption seems to be that 1.4 billion people in a middle-income country are dependent on 300 million for tech.
    • bwfan123 3 hours ago
      > anthropic messed up big time harness works with any muh commodity LLM

      That surprised me too. The intelligence is at the client, and by making that open, Anthropic has commoditized the coding agent.

  • aliljet an hour ago
    [dead]
  • kk_mors 2 hours ago
    [dead]
  • alattaran 7 hours ago
    [flagged]
  • volume_tech 5 hours ago
    [flagged]
  • deadbabe 5 hours ago
    I had a call with our CTO and we are pivoting away from Claude Code to DeepClaude because the cost savings are too substantial to ignore.