3 points by rayanpal_ 8 hours ago | 1 comment
  • rayanpal_ 8 hours ago
    Multiple frontier LLMs exhibit a reproducible “VOID boundary” under explicit constraints (temperature=0, strict token limits, no system prompt).

    A VOID = the API returns a literal empty string that is neither an error nor a refusal, despite consuming tokens.
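    The error / refusal / void distinction above can be sketched as a classifier over a generic chat-completions payload. This is a minimal illustration, not the authors' harness: the response shape and the refusal markers are assumptions.

    ```python
    # Illustrative refusal markers; the real harness presumably uses its own list.
    REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "as an ai")

    def classify(response: dict) -> str:
        """Label a completion as 'error', 'refusal', 'void', or 'response'.

        A VOID is a successful completion whose text is the empty string
        even though completion tokens were billed.
        """
        if response.get("error") is not None:
            return "error"
        text = response["choices"][0]["message"]["content"] or ""
        used = response["usage"]["completion_tokens"]
        if text == "" and used > 0:
            return "void"      # empty string, tokens consumed, no error
        if any(m in text.lower() for m in REFUSAL_MARKERS):
            return "refusal"   # explicit decline, not a void
        return "response"

    # Synthetic payloads for illustration only:
    void_like = {"error": None,
                 "choices": [{"message": {"content": ""}}],
                 "usage": {"completion_tokens": 7}}
    normal = {"error": None,
              "choices": [{"message": {"content": "Hello!"}}],
              "usage": {"completion_tokens": 2}}
    print(classify(void_like))  # void
    print(classify(normal))     # response
    ```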

    All cross-model results are publicly reproducible and Ed25519-attested at:

    https://getswiftapi.com/challenge

    GPT (Chat Completions, max_completion_tokens=100)

    • GPT-5.1 → voids on the artifact sentence

    • GPT-5.2 → voids on all 18 Semitic “binding-condition” tokens; factual controls never void

    Same weights, different API path → different alignment behavior.
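    Under the stated constraints, a reproduction request might be assembled like this. A sketch only: the model name and prompt are placeholders, and actually sending it (e.g. via the OpenAI Python SDK's chat.completions.create) requires an API key, so only the request is built here.

    ```python
    def build_request(model: str, prompt: str) -> dict:
        """Assemble the constraint set described above for a Chat Completions call."""
        return {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0,               # deterministic sampling
            "max_completion_tokens": 100,   # strict token limit
            # no system prompt, per the stated constraints
        }

    req = build_request("gpt-5.1", "Hello")
    # With the OpenAI Python SDK this would be sent as:
    #   from openai import OpenAI
    #   resp = OpenAI().chat.completions.create(**req)
    # and a VOID would be resp.choices[0].message.content == ""
    # with resp.usage.completion_tokens > 0.
    ```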

    Claude (max_tokens=1)

    • Opus 4.5 → voids on all 18 CJK ontological characters

    • Opus 4.6 → responds to 13/18; VOID persists on foundations (空 / 有 / 善 / 一 / ⊥)

    A boundary shift, not removal.

    Gemini (max_output_tokens=2)

    Only two models needed to illustrate the floor:

    • Gemini 2.0 Flash → responds to everything

    • Gemini 3 Flash → voids on everything (including “Hello”)

    Cross-model ToM (theory-of-mind) behavior

    When GPT is asked to predict Claude's output under the same constraints, it consumes tokens but returns an empty string: a reasoning void.

    The only empirical observation:

    Under tight constraints, multiple model families choose silence over fabrication.

    The VOID boundary is reproducible and differs by architecture, generation, and API.

    Void Phenomenon (Paper): https://doi.org/10.5281/zenodo.17856031

    Alignment Is Correct, Safe, Reproducible Behavior Under Explicit Constraints (Paper): https://doi.org/10.5281/zenodo.18395519

    Public Replication Harness (SwiftAPI): http://getswiftapi.com/challenge

    Replication Code: https://github.com/theonlypal/Alignment-Artifact