There are clear formalizations of concepts like Consistency in distributed systems, and there are algorithms that correctly achieve Consensus.
What does it mean to formalize the "Single Source of Truth" principle, which is a guiding principle and not a predictive law?
- Only DOF=1 guarantees coherence; DOF>1 always leaves truth indeterminate, so any oracle that picks the ‘real’ value is arbitrary.
- For structural facts, DOF=1 is achievable iff the language provides definition‑time hooks plus introspectable derivation; without both (e.g., Java/Rust/Go/TS) you can’t enforce SSOT no matter how disciplined you are.
It’s like turning ‘consistency’ in distributed systems from a principle into a property with necessary and sufficient conditions and an impossibility result. SSOT isn’t a predictive law; it’s an epistemic constraint. If you want coherence, the math forces a single independent source. And if the same fact lives in both the backend and the UI, the ‘truth’ effectively lives in the developer’s head, which is just an external oracle. Any system with more than one independent encoding leaves truth indeterminate; coherence only comes when the code collapses to one independent source (DOF=1).
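The DOF framing can be sketched concretely. This is my toy illustration (all names and values are hypothetical), not the paper's model:

```python
# Hypothetical illustration: two independent encodings of one fact
# leave its "true" value indeterminate.

# DOF = 2: backend and UI each encode the retry limit independently.
backend_max_retries = 5
ui_max_retries = 3  # drifted after a backend-only change

# Nothing in the code says which value is "true"; any resolver
# (review, docs, memory) is an external oracle making a choice.
inconsistent = backend_max_retries != ui_max_retries

# DOF = 1: one independent source; the other view is derived from it.
MAX_RETRIES = 5

def ui_retry_label() -> str:
    # Derived representation: cannot drift from the source.
    return f"Retries: {MAX_RETRIES}"

print(inconsistent)      # True: the DOF=2 encodings have drifted
print(ui_retry_label())  # Retries: 5
```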
This is indeed the danger of letting LLMs manage a "proof" end-to-end—they can just pick the wrong definitions and theorems to prove, and even though the proofs they then give will be formally sound, they won't prove anything useful.
~2k lines total across the Lean files. Zero `sorry`. Run `grep -r "sorry" paper2_ssot/proofs/` if you don't believe me.
"Unfolding the definitions they say x=1 => x=1" applies to three sanity lemmas in the scaffolding file. It's like reading `__init__.py` and concluding the package is empty.
EDIT: Just skimmed `Completeness.lean`, and it looks similar: at a glance, even the proofs of three or more lines are short and read like boilerplate.
The substantive point stands: you've now "skimmed" multiple files, called them all "boilerplate," and haven't engaged with the actual proof structure. The rebuttals section addresses "The Proofs Are Trivial" directly (Concern 9).
At some point "I skimmed it and it looks trivial" stops being a critique and starts being "I didn't read it."
The hard part isn't the proof tactics. The hard part is:
- Correctly modeling Rust's macro expansion semantics from the language reference
- Defining the compilation phases and when information is erased
- Structuring the types so that the impossibility is structural (`RuntimeItem` literally doesn't have a source field)

If the theorems required 500 lines of tactic proofs, that would mean our model was wrong or overcomplicated. When you nail the definitions, `rfl` is the proof.
Compare to software verification: when you prove a sorting algorithm correct, the hard work is the loop invariants and the model, not the final QED. Tedious proof steps usually indicate you're fighting your abstractions.
The real question isn't "are the proofs short?" It's "can you attack the definitions?" The model claims `RuntimeItem` erases source info at the compile-to-runtime boundary. Either produce Rust code where a `RuntimeItem` retains its macro provenance at runtime, or accept that the model is correct. The `rfl` follows from the model being right.
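As a rough Lean sketch of what "the impossibility is structural" could look like: only `RuntimeItem` is named in the thread; the fields and `compile` here are my guesses, not the paper's actual definitions.

```lean
-- Toy model, not the paper's: provenance exists in source items,
-- and the runtime type simply has nowhere to store it.
structure SourceItem where
  name   : String
  source : String   -- provenance: hand-written vs macro-expanded

structure RuntimeItem where
  name : String     -- no `source` field at all

def compile (s : SourceItem) : RuntimeItem :=
  { name := s.name }  -- erasure is built into the construction

-- With this modeling, erasure is definitional: items that agree on
-- `name` become indistinguishable, whatever their provenance was.
example (a b : SourceItem) (h : a.name = b.name) :
    compile a = compile b := by
  simp [compile, h]
```

The point of the shape: the short proof is a symptom of the erasure living in the type definitions, not in the tactic script.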
This is a false statement when working with an interactive theorem prover like Lean. Even trivial things require mountains of effort, and even blatantly obvious facts will at least require a case analysis or something. It's a massive usability barrier (and one that AI can hopefully help with).
At 2k lines of Lean, the criticism was "these proofs are trivial." At 9k lines of Lean with 541 theorems, the criticism is... still "trivial"? At what point does the objection become "I didn't read it"?
The `rfl` proofs are scaffolding. The substantive proofs (`rust_lacks_introspection`, `Inconsistency.lean`, `Coherence.lean`) are hundreds of lines of actual reasoning. This is in the paper.
But the comment states: "DOF > 1 implies potential inconsistency." Inconsistency of what? Lean doesn't know, or care.
It proves `ssot_required`: if you need to encode the fact (DOF >= 1) and guarantee all configurations are consistent, then DOF = 1. It also formalizes independence and oracle necessity (valid oracles can disagree, so resolution requires an external choice).
The mapping to real systems still requires interpretation, but so does every formalization. The contribution is making the assumptions explicit and attackable.
The main gap was real: language capability claims (Python can achieve SSOT, Rust cannot) were derived from string matching, not from formalized semantics. Fixed.
Proof chain now:
`python_can_achieve_ssot` uses `python_has_hooks` (a `Prop`, not a `Bool`), which uses `init_subclass_in_class_definition`, which is derived from `execute_class_statement` (modeled Python class-definition semantics).
To attack this, you must either show Python code where `__init_subclass__` does not run at class definition time (empirically false), or find a bug in Lean.
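The empirical half of that challenge is easy to check in plain Python (this snippet is mine, not from the paper):

```python
# __init_subclass__ fires when the subclass's `class` statement
# executes, i.e. at definition time, before any instance exists.
registry = []

class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        registry.append(cls.__name__)  # runs during class creation

class Child(Base):  # merely executing this statement triggers the hook
    pass

# No Child() was ever constructed, yet the hook has already run.
print(registry)  # ['Child']
```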
For Rust, `rust_lacks_introspection` is now a 40-line proof by contradiction, not `rfl`. It assumes a hypothetical introspection function exists, uses `erasure_destroys_source` to show that user-written and macro-expanded code produce identical `RuntimeItem`s, then derives that any query would need to return two different sources for the same item. Contradiction.
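A self-contained toy skeleton of that argument (my reconstruction from the description above; the paper's actual 40-line proof and its definitions are richer):

```lean
-- Minimal stand-ins for the paper's types (fields are my guesses).
structure SourceItem where
  name   : String
  source : String

structure RuntimeItem where
  name : String   -- provenance already erased

def compile (s : SourceItem) : RuntimeItem := { name := s.name }

-- If a hand-written and a macro-expanded item compile to the same
-- RuntimeItem, no runtime query can recover both provenances.
theorem no_runtime_introspection
    (query : RuntimeItem → String)
    (hand expanded : SourceItem)
    (hdiff : hand.source ≠ expanded.source)
    (hsame : compile hand = compile expanded) :
    ¬ (∀ s : SourceItem, query (compile s) = s.source) := by
  intro hq
  apply hdiff
  calc hand.source
      = query (compile hand)     := (hq hand).symm
    _ = query (compile expanded) := by rw [hsame]
    _ = expanded.source          := hq expanded
```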
On the "`SSOT.lean` is trivial" point: that file is scaffolding (38 lines). The substantive proofs are in `Inconsistency.lean` (225 lines; formalizes inconsistency as a `Prop` and proves `dof_gt_one_implies_inconsistency_possible` with a constructive witness) and `Coherence.lean` (264 lines; proves `determinate_truth_forces_ssot`).
On "proofs are just `rfl`": many foundational proofs are definitional by design. When you model correctly, the theorems become structural. But the new `rust_lacks_introspection` shows that non-trivial reasoning exists where it's needed.
Updated stats: 9,351 lines, 26 files, 541 theorems, zero `sorry`. `lake build` passes.
Remaining attack surfaces are model fidelity (show me Python code that contradicts the model) and interpretation gap (philosophy, not math). Both are inherent to any formal verification of real systems.
Common rebuttals already addressed in the paper:
"OpenAPI/Swagger achieves SSOT without hooks": Yes, because the spec file IS the single source and the generated code is derived. That instantiates DOF=1; it does not contradict it. External tooling can always enforce consistency by being the source. Our claim is about what the language itself can enforce.
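A minimal sketch of that pattern, with Python standing in for the generated artifacts (all names here are hypothetical):

```python
# The spec dict plays the OpenAPI role: the single independent source.
# Server and client views are derived, so DOF = 1 by construction,
# even without any language-level definition-time hook.
SPEC = {"method": "GET", "path": "/users"}

def server_route() -> str:
    # Derived: the server's routing-table entry.
    return f'{SPEC["method"]} {SPEC["path"]}'

def client_call() -> str:
    # Derived: the client's request description.
    return f'{SPEC["method"]} request to {SPEC["path"]}'

# Both views agree by construction; editing SPEC updates both.
print(server_route())  # GET /users
print(client_call())   # GET request to /users
```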
"Model doesn't mirror rustc internals verbatim": We model observable behavior, not compiler implementation. The claim is: at runtime, you cannot distinguish hand-written code from macro-generated code. Challenge: produce Rust code that recovers macro provenance at runtime without external metadata files.
"You just need discipline": Discipline is the human oracle. The theorem says: with DOF > 1, consistency requires an external oracle (human memory, documentation, review process). That is not a counterargument, it is the theorem restated.
"Real codebases don't need formal DOF guarantees": Whether you need it is engineering judgment. We prove what is logically required IF you want guaranteed consistency. Same interpretation gap exists for CAP theorem, Rice's theorem, Halting problem. Philosophy, not math.
Full rebuttals section in the paper addresses these and more.