> Swapping models and inflating tokens. Because users’ inputs and model outputs are mediated through a proxy, users cannot verify which model their request was actually routed to. A user selects Opus 4.7, but the proxy can silently route to Sonnet, Haiku, or, in the worst case, GLM or Qwen, and fraudulently relabel the output. In a recent paper from Germany’s CISPA Helmholtz Center for Information Security (which cited my article on the grey market from last year!), researchers audited 17 API proxies and found widespread model swapping: proxy access to “Gemini-2.5” achieved only 37.00% on a medical benchmark, a staggering drop from the official API’s 83.82%. On the user end, the tell only comes on complex tasks, when the output feels off (often referred to as 降智, “dumbed-down”), but there is no clean way to prove it. Numerous public records highlight concerns that certain API proxies noticeably compromise model performance; these proxies are suspected of “diluting” (掺水) their service by substituting premium frontier models with inferior tiers.
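The audit described in the quoted passage boils down to a differential benchmark: run the same question set through the official API and through the proxy claiming the same model, then flag the proxy if its accuracy falls well below the official baseline. A minimal sketch of that comparison, with the API calls stubbed out as answer lists and a hypothetical `max_drop` threshold (the paper's actual methodology and thresholds are not reproduced here):

```python
def accuracy(answers, gold):
    """Fraction of answers matching the gold labels."""
    return sum(a == g for a, g in zip(answers, gold)) / len(gold)

def suspected_swap(official_answers, proxy_answers, gold, max_drop=0.15):
    """Flag a proxy whose accuracy falls more than `max_drop`
    below the official API's on the same benchmark."""
    off = accuracy(official_answers, gold)
    prox = accuracy(proxy_answers, gold)
    return (off - prox) > max_drop, off, prox

# Toy benchmark run: real audits would collect these answer lists
# by querying both endpoints with identical prompts.
gold     = ["A", "C", "B", "D", "A", "B", "C", "D", "A", "B"]
official = ["A", "C", "B", "D", "A", "B", "C", "D", "A", "C"]  # 9/10 correct
proxy    = ["A", "B", "B", "A", "A", "D", "C", "C", "A", "C"]  # 5/10 correct

flagged, off_acc, prox_acc = suspected_swap(official, proxy, gold)
print(flagged, off_acc, prox_acc)  # → True 0.9 0.5
```

With a 37.00% vs 83.82% gap like the one reported, any reasonable threshold trips; the hard part in practice is sampling variance on small question sets, which is why audits use sizeable benchmarks rather than a handful of prompts.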
So no, those cheap tokens won't necessarily be Claude.
Odd that they'd risk getting screwed over like that, when DeepSeek v4 Pro is pretty okay nowadays for quite a few tasks. I guess it's a bit like OpenRouter, where I get to try out all sorts of models with relatively little hassle (though nobody will give me a discount), but I have to acknowledge that some providers will quantize the models so aggressively that they're borderline unusable.
Still, I personally think there's one piece missing in the article. Why would it be OK to restrict Chinese users from using American models? I mean, I'm strongly anti-AI and I believe all AI companies need to die because they enhance the worst humanity has to offer. However, if AI is going to be legal, how can it be ethical to discriminate based on one's country? Especially if said country (China) is the one refining 90% of the minerals and rare earths the US uses to produce its computers.