Notes on DeepSeek (twitter.com)

91 points by vinhnx 5 hours ago | 10 comments

quadruple 4 hours ago
Post appears to have been removed, I caught a copy of it: https://pastebin.com/rcAqEFG1
I assume it will get reposted at some point.
- swyx 3 hours ago
  thanks. this really isn't that long, might as well paste in full here since OP deleted.
  Notes on DeepSeek:
  We visited the company HQ last Tuesday. It was founded in 2023 by Liang Wenfeng and operated out of his hedge fund, High-Flyer, until somewhat recently. The company released their R1 model in January 2025, so it was interesting to see what they’ve been doing.
  The company is located in an unmarked, 12-story building in Hangzhou. There is no DeepSeek branding visible from the street or lobby. I asked why this is, and the team demurred and said, “Well, there are many companies in this building, and we are not special.” They want to keep a low profile.
  We met with their Head of Data and Head of Infrastructure. The company only has 300 employees. They are at least an order-of-magnitude smaller than Anthropic, and don’t care to scale further just yet. Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country. (We briefly walked through the labs, and everybody seemed young. There was a lot of discussion; it felt like an exciting and energetic place.)
  Lots of competition is coming from Alibaba (Qwen), ByteDance, and Moonshot (Kimi). People in China seem to mostly use Kimi or DeepSeek. Young people use VPNs to access Claude, though Anthropic has blockers around usage in China and makes it difficult. Poaching between groups is common, just like in the U.S. DeepSeek has a reputation as being really smart and “cool,” maybe similar to Anthropic. Big labs are mostly in Beijing, near Tsinghua and Peking University, with Hangzhou as the main exception (DeepSeek and Alibaba/Qwen are there).
  The DeepSeek team reads western AI writers. They listen to Dwarkesh and read Gwern. The people we met with said they had never met with any employees from Anthropic. They were not at all concerned with some kind of hostile / AGI takeover scenario. They kept bringing up job loss (which is already high amongst youth in China) as their main concern. When we asked if they do red teaming on their models, they said no. In China, AI models are not regulated directly; the government instead has restrictions on how those models can be used in software, services, etc.
  As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.
  We asked the DeepSeek team: “What has the highlight been so far? What are your plans for an exit?” And they said that their highlight and great achievement was R1. They did not gesticulate at a future model or vision, but rather seemed proudest of what they’ve already done. They are content for now to remain ~6 months behind U.S. companies while maintaining a lower profile and team size.
  - sinuhe69 3 hours ago
    I don't get the part of "AI models are not regulated directly, the government instead has restrictions on how those models can be used in software, services". Is it not the same thing? When I chat with DeepSeek about any (Chinese) political/social issue, it immediately begins aligning with the party's line or just cuts off the conversation abruptly.
    throwaw12 3 hours ago
    It is not, just download the model and ask the same questions.
    mortenjorck 3 hours ago
    I think that's less the result of any regulation specifically targeted at AI and more Chinese labs interpreting longstanding, broad regulation around "preserving social harmony" as it relates to post-training.
    throawayonthe an hour ago
    isn't that exactly what the quote says? the software service (presumably their web chat) has restrictions that the model itself does not
  - sometimelurker 2 hours ago
    > They were not at all concerned with some kind of hostile / AGI takeover scenario.
    this doesn't sound believable, or at least it seems off. competent ai engineers should have good intuition about how agents work, and what happens when they don't do what you want them to do: https://www.forbes.com/sites/boazsobrado/2026/03/11/alibabas...
  - lofaszvanitt 3 hours ago
    > They listen to Dwarkesh
    oh jesus, that guy and his absolute baloney, empty interviews.... sigh.
alecco 4 hours ago
I remember reading a similar tweet explaining DeepSeek breaks the insane Chinese work culture. They are against 996 and brutally grinding employees. They feel like a big family and that is their hedge against poaching by Chinese Big Tech with bigger salaries. Liang Wenfeng seems to be the only AI CEO down to earth. I want to believe.
gbraad 5 hours ago
Not sure what I read, but sounded like a lunch meeting description; felt void of actual information, with the restaurant replaced by the office. I am in China and can tell it is either Kimi, DeepSeek or Claude (proxied or actually deepseek/fake). The bigger push for the general public died down a lot since last year; kids were pushed to use AI for homework, now it is disallowed and frowned upon. In short mixed messaging.
- sinuhe69 3 hours ago
  With billions in government funding pushed into the AI buildout, fast-paced integration at large scale, and a sweeping national education reform for AI, I don't think it can be said to have "died down".
  [0] https://www.reuters.com/world/china/china-prepares-295-billi...
  [1] https://www.globalneighbours.org/en/articles/china-unveils-n...
  [2] https://english.www.gov.cn/news/202606/10/content_WS6a296017...
- genewitch 4 hours ago
  > kids were pushed to use AI for homework, now it is disallowed and frowned upon. In short mixed messaging.
  in the early 2000s in california universities you'd get marked down for citing wikipedia. so the good souls told everyone "see the number in brackets[2] after what you're trying to cite the article for? just click that then click the archive.org or whatever link there, then cite that."
  Now? i think wiki is considered a valid source? or has it flopped back to being "unreliable"?
  - sheept 4 hours ago
    It's not that it's unreliable, it's just lazy research. Wikipedia, like all encyclopedias, is a tertiary source, but ideally your essay should be a mix of primary and secondary sources, while Wikipedia discourages original research and prefers only secondary sources. Wikipedia itself recommends against citing it as research[0] for this reason.
    [0]: https://en.wikipedia.org/wiki/Wikipedia:Citing_Wikipedia
    kaliqt 3 hours ago
    Laziness should never be the issue.
    The issue is that Wikipedia can be wrong and you’d only know that by going to the source (or lack thereof), or checking other sources.
    pqtyw 2 hours ago
    All secondary sources can be just as wrong; standards of course differ, but being published doesn't prove much on its own. Also, in many/most non-theoretical fields you find plenty of conflicting sources, so relying on a "consensus"-based, high-quality encyclopaedia article seems like a more reliable approach if you are new to the field and don't really understand what you are reading.
  - ValentineC 3 hours ago
    I think Wikipedia's still considered unreliable, but the question that should be asked is whether the author even read the source in "the number in brackets" to ensure that it's even backed properly.
    Just like how people should use AI for research, I guess.
- tyingq 3 hours ago
  Were things like "300 employees" and descriptions of the deliberately low-key HQ out there before? That counts as actual information to me.
- adampunk 4 hours ago
  It’s a puff piece written by someone who didn’t know (or didn’t care) they were being managed.
  - gbraad 4 hours ago
    "Like this, read my blog" — said DeepSeek
bel8 5 hours ago
From the notes, they seem humble and empathic.
We're lucky to have China imposing competition on the western AI megacorps.
If it wasn't for China, I would probably have to spend $100/mo on AI instead of $10 like I do currently while using DeepSeek and MiMo (opencode Go plan).
And while I could do so comfortably, I feel for those who can't. It must feel incredibly isolating to only watch others have access to expensive models to leverage their careers.
I hope SoTA AI becomes a universal right because it will contribute to too much income disparity otherwise.
- yeodev34 minutes ago
  Ever since I found Opencode Go, AI coding is fun. I always hate the feeling of working inside a fenced constraint where if I just go hard enough I suddenly hit a wall and have to pay up a LOT more.
  It's crazy how much you get out from Deepseek V4 Flash alone.
- alecco 4 hours ago
  > We're lucky to have China imposing competition on the western AI megacorps.
  The second they get a hold of the market, Chinese Big Tech will be as bad or worse than US Big Tech.
  We're lucky to have DeepSeek.
  - slaw 4 hours ago
    In every market China dominates, Chinese products are still inexpensive. Solar panels, batteries, EVs, drones, ...
    hsuduebc2 3 hours ago
    Because they are subsidized by the Chinese government. This is literally a tactic to destroy global competition.
    It's a smart move to make everyone dependent on them.
    throawayonthe an hour ago
    1. this seems to be based on misconceptions about how the chinese economy works 2. why haven't they done it yet? is the implication that they will wait until they're dominant in some x number of industries worldwide and then... raise prices?
    p.s. how would such "subsidization" work on such a scale? if you think the EVs, PV panels, etc are cheap because the govt like, just covers the loss on every sale(?) where do they get all that surplus finance to cover labour and resources?
    have you considered 'subsidies' can be used for accelerating R&D for national interest rather than some monopolistic plot
    manishsharan 3 hours ago
    Any evidence to back that up?
    xyzsparetimexyz 2 hours ago
    Fact 1: that's terror Fact 2: that's terror
    hsuduebc2 2 hours ago
    Back what exactly? That China subsidized its domestic companies?
    https://www.oecd.org/en/publications/subsidies-and-the-solar...
    https://www.oecd.org/content/dam/oecd/en/publications/report...
- Qhemlomo 5 hours ago
  I see this problem already for me.
  I have unlimited tokens at work, then i go home and what do i do? Spend $200 per month? No, def not.
  When Anthropic increased the limits for their $20 plan, i started coding with it again on a private project and it was fun and i did a lot in those 4 weeks.
- cmrdporcupine 5 hours ago
  Yep. After yesterday's moves around "Fable 5" even twice as much.
  We've had a taste, and damned if I'm going to have the "means of production" snatched from me already?
  - genewitch 4 hours ago
    approximately how many months/years until there are "illegal models"?
    kennywinker 3 hours ago
    You wouldn’t steal a brain
    cmrdporcupine 3 hours ago
    ... watch me ;-)
cmrdporcupine 5 hours ago
"As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment."
This is a refreshing perspective.
- sinuhe69 3 hours ago
  The CCP is very active in the matter of AI. In fact, the DeepSeek moment was responsible for Xi calling for a private meeting with tech bosses, including the exiled Alibaba founder Ma, which is practically unheard of in Chinese politics.
  I don't have enough information to say whether the Chinese leadership sees AI "just as the next technology" or whether they are more cautious due to its double-edged nature. But given the immense effort to build their own AI/GPU chips, plus the billions in government funding pushed into the AI buildout, a directive for fast-paced, large-scale integration, and a sweeping national education reform for AI, I don't think it can be seen as similar to other ordinary techs.
  [0] https://www.reuters.com/world/china/china-prepares-295-billi...
  [1] https://www.globalneighbours.org/en/articles/china-unveils-n...
  [2] https://english.www.gov.cn/news/202606/10/content_WS6a296017...
  - SXX 2 hours ago
    > I don't think it can be seen as similar to other ordinary techs.
    Not saying it's a bad thing, but the US and EU have limited exports of chips and lithography equipment to China for decades.
    There is literally nothing else China can do to secure their supply of chips. They would do it even without the AI bubble.
    It's military tech now and this is not just about LLMs. Autonomous flying killbots need GPUs too.
  - pstuart 2 hours ago
    There's plenty to not like about the CCP, but their strategic investment in the country as a whole is impressive. It would be great to have that on our side as well but with the current state of things that is a non-starter.
- flawn 5 hours ago
  The CCP knows, whatever the heck this technology will bring with itself, the current power dynamic inside of the country is on their side, and AI will solidify it.
  I hypothesize that, rather than slowly having it disperse in society and allow people to harness it in ways they don't want, they might as well accelerate everything until AI becomes the totalitarian swiss knife - which they can make use of in the best way of course.
  Let's see what will happen.
  - culi 3 hours ago
    US used AI (Claude on Maven) to determine a girl's elementary school as a target in war[0] and then triple tapped it and you're still more worried about hypothetical misuses of the single country responsible for this technology not being concentrated in the hands of a few powerful elite? ffs
    [0] https://www.washingtonpost.com/national-security/2026/03/11/...
    SXX 2 hours ago
    It's a horrible event, but it was hit because the Iranian regime builds schools and hospitals across the street from military bases.
    Nothing to do with AI and can happen in any war. Do some research, check satellite imagery:
    https://goo.gl/maps/ZoAXkw1iFwyF7exQ8?g_st=ac
    PS: I'm not trying to defend bombing schools, but posting that it's "AI" that is responsible is the opposite of what you need to do if you care.
    It's the military: there were specific people who found this location for the strike, then some senior officers who chose it without checking, and specific people who executed it. And it's all logged with a "paper" trail in the chain of command.
    It was all people with specific names who are responsible for avoiding bombing schools. They failed. Not "AI".
    culi an hour ago
    The U.S. operates over 160 public schools physically located on military installations
    https://oldcc.gov/our-programs/public-schools-military-insta...
    I never said AI is responsible. I pointed out the US is clearly the one using AI in dystopian ways.
  - cmrdporcupine 5 hours ago
    I don't really see how open weights models further what you're talking about.
    It's trivial for me to download one of their models and run it on my Spark, and there's all sorts of ways to strip out their Tiananmen-denialism or whatever.
    If/when the memory price crunch dissipates, even more so. And so far it's only China I see as making moves to increase production capacity on memory, too.
    If anything the centralization of capital into US-based Anthropic and OpenAI is far more terrifying from the perspective you're outlining.
- infecto 4 hours ago
  China is probably more capitalist in many respects than the west these days. AI, robotics and automation are a way to push into the future. In the west we have endless researchers stuck in a psychosis that they are talking to a sentient being.
- SockThief 4 hours ago
  "National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration."
  Further on. Refreshing indeed.
- surgical_fire 4 hours ago
  Especially here on HN, where AI anxiety (especially amongst those that are really nervous that it needs to succeed) is very, very tiresome.
zkmon 4 hours ago
Why would the agent send the results of the query "Show me my recent transactions" to the LLM? These are pretty deterministic results which involve no LLM interpretation or decision making.
- forsalebypwner an hour ago
  wrong thread?
seydor 5 hours ago
US AI is almost a religious cult. It's devastating that they are treating it as a petty commodity
- alecco 4 hours ago
  Altman used to talk about making a religion and Dario Amodei constantly talks about "building a God" and meets with religious leaders including the Vatican.
  > It got me thinking, though--the most successful founders do not set out to create companies. They are on a mission to create something closer to a religion, and at some point it turns out that forming a company is the easiest way to do so. [1]
  [1] https://blog.samaltman.com/successful-people
- windexh8er 4 hours ago
  I would argue the US providers have gone full tilt into sales culture with respect to AI. Anything is said on a whim to redirect attention back from whomever is in the limelight. Initially I thought Anthropic was more pragmatic, but the constant release cycles of things that don't exist for most people, the gatekeeping, the statements made by Dario, it's all a part of large brand toxic sales and marketing.
  From the notes this part sat with me as the real difference:
  > As a whole, China seems to treat AI as just another technology, rather than as some kind of singularity moment. National attention is still on basic needs and infrastructure buildouts, and on providing more medicines for people. The “dreams of singularity” seem like a luxury or distant consideration.
  Meanwhile... In the fantasy land over here in the US we're constantly being told that it's "coming", "almost here", "too powerful for us to give you access to", "of national security importance!". Or... FUD.
  And while there may be trace amounts of truth in those overzealous statements, we haven't seen a significant improvement in much outside of software development relative to the spend and environmental impact.
cmrdporcupine 5 hours ago
https://xcancel.com/NikoMcCarty/status/2064686557400100884
vinhnx 4 hours ago
It seems the OP has removed the tweet somehow.
- davidwritesbugs 3 hours ago
  someone kept a copy: https://pastebin.com/rcAqEFG1
dude250711 5 hours ago
"Their Head of Infrastructure, in particular, was young; maybe 30 years old and apparently one of the best AI buildout and energy experts in the country"
Expert in buildout or expert in distillation?
- seydor 5 hours ago
  What's wrong with distillation? Wasn't GPT a distillation of the world's internet? That's how technology levels proceed, by recursively consuming the previous ones.
  - boristsr 5 hours ago
    It's absolutely mind boggling to see model distillation called theft, a class of attack, and all sorts of other things, all the while Meta is in court for copyright violation and Anthropic has had to settle a case with authors. With distillation "attacks" at least they paid API fees.
    ImprobableTruth 4 hours ago
    Anthropic had to settle with authors because they literally pirated books! Their behavior regarding distillation is genuinely beyond parody.
    FergusArgyll 4 hours ago
    There are 2 things worth separating.
    1) China distills and is therefore morally bad.
    As you rightly point out, that's not a great argument.
    2) China distills and is therefore possibly not that competent.
    I think that makes sense. If they only catch up to the frontier through distillation then 1) Their model will never be as good as the model they are distilling from. 2) They will never reach the frontier - they need someone else to do it first.
    _aavaa_ 4 hours ago
    This is literally a repeat of the whole “China only makes low quality cheap stuff” argument.
    “All they do is copy.”
    And now, oops, they are world leaders in EVs, batteries, solar, drones, just to name a few of the biggest consumer-facing things.
    plasticsoprano 3 hours ago
    "Success leaves clues"
    You gotta start somewhere and you can start at page 1 or page 10 and that time, energy and cost you saved starting 9 pages later can be put into making whatever it is you're building better than the original.
    The US, and every other country, is full of derivatives or straight up copies. No one is getting super mad at the generic cheerios at the grocery store. It's hypocrisy.
    Lerc 4 hours ago
    >2) China distills and is therefore possibly not that competent.
    I think deepseek at least has done enough innovative work that you could grant them a baseline of competency.
    In general, there are enough papers coming out of China to suggest that there are quite a few people there who know what they are doing.
    FergusArgyll 4 hours ago
    You're correct and I shouldn't have used the word competent. Perhaps "and is therefore not elite enough to be state of the art"?
    I also have a soft spot for deepseek because they write such readable papers. I don't have a degree in anything but with a little work I can understand their papers - which I really appreciate.
    But I still think my point stands - if you need distillation you won't be SOTA
    xyzsparetimexyz 2 hours ago
    Deepseek models are on the Pareto frontier of cost/performance. That's far more important than just making a top-scoring model.
    surgical_fire 4 hours ago
    > China distills and is therefore possibly not that competent.
    I heard that argument more than one year ago, when chain of thought and reasoning cycles started to be hidden to protect against distillation.
    Meanwhile, models such as DeepSeek and MiMo are nothing short of excellent nowadays.
    Ever since I switched away from OpenAI to DeepSeek I never felt the need to go back.
    toraway 3 hours ago
    Deepseek Flash V4 really was a "holy shit" moment and deserves the praise/hype it's been getting from users. I have a multi-tier subscription strategy I've maintained for the last year of: 1. $20-$30 plan from first Claude now Codex for "SOTA" 2. Gemini via the extra $10/mo or so from my Google One plan 3. a cheap fallback plan.
    Together it gives me plenty of head room/model performance for $40ish/mo, plus letting me compare the various models over time.
    Originally I'd been using the Z.AI plan (that I'm still grandfathered into for <1 yr) as my cheap plan but wasn't keeping up with the SOTA progress and is slow/limited now. So I subscribed to the Opencode Go plan and use Deepseek Flash V4 almost exclusively and it is insane how much usage I can get for $10/mo.
    I did the math on my Flash usage vs. what I'm paying Opencode and I'm typically not even exceeding $10 in API costs! So it's actually sustainable not rugpull pricing at least for me. I can pound it with requests/agentic loops and have it running for 30 min doing whatever the fuck and check back and have spent literal pennies for what would have cost $30+ on my work's Github Copilot plan.
    I know enterprise world works under different rules and isn't price sensitive in the same ways as an individual but I truly don't see how this is sustainable for the US AI giants in the long term to maintain like 25x+ markup for 1.25x performance benefit.
    IMO it does help explain the recent emphasis on secret, scary "super models" like Mythos to muddy the waters for decision makers with hype and FOMO at a time when companies are beginning to seriously scrutinize their token spending for the first time.
- simonw 4 hours ago
  Blaming the head of infrastructure for distillation doesn't make sense to me.
- amunozo 4 hours ago
  Tell me, where did OpenAI and Anthropic get their training data? From public sources using legitimate means? Don't make me laugh.
- ReptileMan 4 hours ago
  Both. Both are good. Anyway this shows how full of shit Anthropic are - if Mythos was so advanced as they claim - distillation attacks just wouldn't work.