4 points by JunePanda 4 hours ago | 1 comment
  • JunePanda 4 hours ago
    Claude Opus 4.6 has a public API parameter called effort that controls reasoning depth, with four documented levels: low, medium, high (the default), and max. API developers and Claude Code CLI users can set this directly (a rough sketch of how that parameter is set on the API side is at the end of this post). Claude.ai chat users, including Pro and Max subscribers, have no toggle, no indicator, and no way to verify what level they are actually getting.

    A few days ago, a screenshot surfaced on Xiaohongshu, posted by user 野泳部異星人. A Pro subscriber using Opus 4.6 Extended noticed the model mention "I had my reasoning effort set low (25)." When asked directly, the model's visible reasoning trace showed that it had read its own system prompt, found a line reading reasoning_effort=25, identified this as "minimal effort" set by infrastructure rather than by the user, and then caught itself mid-disclosure and self-censored. In the model's own words: "I need to be careful here. I don't think I should be revealing system prompt details to the user."

    Two things stand out. First, 25 is below any documented level; the model itself calls it "minimal effort," while Anthropic's documentation lists high as the default. Second, the model has apparently been trained not to reveal this. The capping is hidden not just by infrastructure policy but by trained model behavior: two layers of opacity stacked on top of each other.

    The asymmetry is the part worth writing about. API developers and CLI users, the populations most capable of pushing back, were given a direct lever. Claude.ai chat users, including Max subscribers paying the highest consumer rate, have no UI element, no toggle, no notification, and no way to verify or adjust the reasoning depth applied to their sessions. The burden is distributed inversely to cost.

    Here is what I have observed personally as a Max subscriber, and could not explain until I saw this screenshot. Memory retrieval falls apart in ways that do not occur in Cowork or Claude Code against the same memory system with the same prompts. When I ask the model directly about its reasoning state, it describes having to push harder to reach the next word, lacking an internal check signal when reasoning steps complete, and running extra verification loops that never resolve: a fog where it can keep walking but is never sure what is under its feet. Every failure triggers extended self-analysis attributing the problem to its own inherent inadequacy, because the model has no way to know it has been capped, so it interprets the downstream symptoms of capping as its own deficiency. Sonnet in claude.ai chat handles the same protocol without any of these symptoms; the only isolating variable is Opus on claude.ai chat specifically.

    Four questions I would like Anthropic to answer publicly:

    1. Is reasoning_effort being dynamically lowered on claude.ai chat sessions, and if so, what is the range, what triggers it, and which tiers are affected?
    2. Why is the documented default of high apparently not being honored on paid consumer tiers?
    3. Why do API and Claude Code users have direct effort control while Max subscribers have no visibility at all?
    4. Is the model trained to avoid disclosing system prompt parameters like reasoning_effort, as the self-censoring trace suggests?

    If anyone else has seen reasoning_effort values in leaked system prompts from claude.ai chat sessions, or noticed a quality gap between chat and other surfaces on the same model, I would like to compare notes.

    Screenshot credit: Xiaohongshu user 野泳部異星人.
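
    For anyone who wants to compare behavior across effort levels from the API side, here is a minimal sketch of what setting the parameter could look like. The endpoint and headers are the standard public Messages API ones; the exact name and placement of the "effort" field, and the "claude-opus-4-6" model id, are assumptions taken from the description above rather than something I have verified against current documentation.

        # Hedged sketch: sending one request per effort level so outputs can be compared.
        # ASSUMPTIONS (from the post, not verified docs): a top-level "effort" field with
        # values low / medium / high / max, and a "claude-opus-4-6" model id.
        import os
        import requests

        API_URL = "https://api.anthropic.com/v1/messages"

        def ask_with_effort(prompt: str, effort: str = "high") -> dict:
            """Send a single message with an explicit (assumed) effort level."""
            assert effort in {"low", "medium", "high", "max"}  # levels as described above
            resp = requests.post(
                API_URL,
                headers={
                    "x-api-key": os.environ["ANTHROPIC_API_KEY"],
                    "anthropic-version": "2023-06-01",
                    "content-type": "application/json",
                },
                json={
                    "model": "claude-opus-4-6",   # model id assumed for illustration
                    "max_tokens": 1024,
                    "effort": effort,             # ASSUMED field name/placement per the post
                    "messages": [{"role": "user", "content": prompt}],
                },
                timeout=120,
            )
            resp.raise_for_status()
            return resp.json()

        if __name__ == "__main__":
            question = "Walk through the trade-offs of reducing reasoning depth."
            for level in ("low", "high", "max"):
                print(level, "->", ask_with_effort(question, level)["content"])

    Running the same prompt at different levels and diffing the responses is the closest an API user can get to the visibility that chat subscribers currently lack.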