6 points by malshe 8 hours ago | 3 comments
  • ViktorKuz 8 hours ago
    More and more developers are switching to local LLMs, and the #1 reason is simple: security. Your data never leaves your machine, so there is zero risk of leaks. Meanwhile, we've seen dozens of high-profile incidents of cloud providers leaking private chats and prompts in the last 12-18 months alone. And you still have to pay a premium for that "privilege".

    At the same time, modern local models are basically on par with cloud ones. Qwen2.5-14B, Llama-3.1-70B at Q4, or even 32B-class models now run on consumer hardware and deliver quality within a few Elo points of GPT-4o-mini or Claude-3.5-Haiku, often beating them on specific tasks.

    This isn't about "Chinese models suddenly winning". It's about the future belonging to local optimization: quantization, speculative decoding, CPU offloading, MoE on a single GPU, etc. When you own the entire stack, you get speed, privacy, and cost that no cloud provider can match. The tide has turned.
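    A quick back-of-envelope sketch of why those quantized sizes fit consumer hardware. The bits-per-weight figure and the "weights-only" simplification are my assumptions (KV cache and activation overhead are ignored), not something the comment states:

```python
# Rough VRAM estimate for quantized local model weights.
# Assumption: weight storage dominates; 4.5 bits/weight approximates
# a Q4_K_M-style quantization. KV cache and activations are ignored.

def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of model weights in gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

for name, params in [
    ("Qwen2.5-14B @ Q4", 14),
    ("32B-class  @ Q4", 32),
    ("Llama-3.1-70B @ Q4", 70),
]:
    print(f"{name}: ~{weight_footprint_gb(params, 4.5):.0f} GB")
```

    By this estimate a 14B model at Q4 needs roughly 8 GB and a 32B model roughly 18 GB, which is why 24 GB consumer GPUs are the usual target; the 70B model at ~39 GB is where CPU offloading starts to matter.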
  • deeptishukla22 7 hours ago
    What’s happening here feels less like “Chinese models gaining share” and more like a substrate shift driven by cost physics. When inference drops from dollars to cents and quality converges to GPT-4-mini territory, the default stack for early-stage teams flips almost overnight. At that point founders optimize for runway, not sentiment, and open models become the path of least resistance.

    The more interesting consequence is that when inference and fine-tuning are essentially free at startup scale, specialization becomes viable again. Instead of generic prompting against a closed API, teams can afford narrow, high-precision models tailored to their domain — something that used to be economically out of reach. Came across this interesting post - https://www.linkedin.com/feed/update/urn:li:activity:7396291...
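    To make the "cost physics" concrete, here is a toy comparison. All prices and volumes below are hypothetical placeholders I chose for illustration; they are not from the thread or any real price sheet:

```python
# Hypothetical illustration of dollars-vs-cents inference economics.
# Both per-million-token prices and the monthly volume are made-up
# placeholder numbers, chosen only to show the shape of the tradeoff.

API_PRICE_PER_MTOK = 2.50    # $/1M tokens, hypothetical closed-API price
LOCAL_PRICE_PER_MTOK = 0.05  # $/1M tokens, hypothetical power + amortized GPU

def monthly_cost(tokens_millions: float, price_per_mtok: float) -> float:
    """Monthly spend for a given token volume at a given unit price."""
    return tokens_millions * price_per_mtok

tokens = 500  # hypothetical early-stage volume: 500M tokens/month
api = monthly_cost(tokens, API_PRICE_PER_MTOK)
local = monthly_cost(tokens, LOCAL_PRICE_PER_MTOK)
print(f"API: ${api:,.0f}/mo  local: ${local:,.0f}/mo  ratio: {api/local:.0f}x")
```

    Under these made-up numbers the gap is ~50x, which is the kind of spread that makes a runway-constrained founder's choice automatic regardless of model sentiment.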

  • StealthyStart 8 hours ago
    This quote says it all: "AI startups are seeing record valuations, but many are building on a foundation of cheap, free-to-download Chinese AI models."

    Cheap and free to download. Most developers would rather spend weeks rebuilding something themselves than pay $20 a month for a tool.

    • verdverm 6 hours ago
      I recently started building a custom coding agent for VS Code. The reason: control.

      Big AI has prompts you cannot remove. They have to because they have a big audience, get attacked relentlessly, and have to be mindful of PR events.

      Now, while I can avoid the Copilot/Claude Code agent prompts, I am still using their models through their services and am subject to their prompts. Moving to running models directly is the next step, and the only way to do that is with open models. And there, the Chinese labs have been building better open models, which is why we see their usage rising.

      It's more about full stack control than it is about price (imo)
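      A small sketch of what that control looks like in practice. Local servers such as llama.cpp's server, Ollama, and vLLM all expose an OpenAI-compatible chat endpoint, so every message in the payload, including the system prompt, is entirely yours. The model tag and endpoint URL below are placeholders of mine:

```python
# "Full stack control": building a chat-completions payload for a local
# OpenAI-compatible endpoint. The only system prompt in the request is
# the one we wrote; no vendor prompt sits underneath it.
import json

def build_chat_request(system_prompt: str, user_prompt: str, model: str) -> dict:
    """Assemble a chat payload where we control every message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.2,
    }

req = build_chat_request(
    system_prompt="You are a terse coding agent. Output unified diffs only.",
    user_prompt="Rename function foo to bar in utils.py",
    model="qwen2.5-coder:14b",  # placeholder local model tag
)
# POST json.dumps(req) to e.g. http://localhost:11434/v1/chat/completions
print(json.dumps(req, indent=2))
```

      With a hosted coding agent, the vendor's safety and branding prompts are prepended before yours ever reaches the model; here the message list above is the whole conversation context.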