2 points by W_rey45 18 days ago | 4 comments
  • W_rey45 18 days ago
    Threat model / scope: This design assumes the signer’s private key is trusted at issuance time; it does not attempt to prove semantic correctness of the agent’s reasoning or inputs. The signature covers only the canonicalized signed_block; any mutation invalidates verification. Receipts are portable and verifiable offline but do not prevent a malicious issuer from signing false data (integrity primitive, not a truth oracle). Replay is detectable (e.g. via hash chaining or external indexing) but not prevented by the receipt alone. Confidentiality is out of scope; receipts are integrity-only artifacts. The goal is to make post-hoc tampering and log forgery detectable, not to replace policy enforcement or access control.
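To make the "any mutation invalidates verification" point concrete, here is a minimal sketch of the canonicalize-then-hash step. It assumes sorted-key compact JSON as the canonical form and SHA-256 as the hash; neither is confirmed by the thread, and the field names are hypothetical. The Ed25519 signature over the digest is left out so the example needs only the standard library.

```python
import hashlib
import json


def canonicalize(signed_block: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace.

    Assumption: a simplified canonical form, not a full RFC 8785 implementation.
    """
    return json.dumps(
        signed_block, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest(signed_block: dict) -> str:
    """SHA-256 over the canonical bytes; this is the value a signer would sign."""
    return hashlib.sha256(canonicalize(signed_block)).hexdigest()


# Hypothetical field names, for illustration only.
original = {"action": "deploy", "target": "staging", "agent_id": "agent-7"}
reordered = {"agent_id": "agent-7", "target": "staging", "action": "deploy"}
tampered = {**original, "target": "prod"}

same_after_reorder = digest(original) == digest(reordered)  # key order is irrelevant
changed_after_edit = digest(original) != digest(tampered)   # any mutation shows up
```

Note that the digest only shows the block was not altered after issuance. It says nothing about whether the issuer recorded true data, which matches the scope stated above.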
  • nulone 18 days ago
    Solid primitive. Two questions:

    1. Crash edge case: If an agent executes a side-effect and dies before signing the receipt, is that action orphaned? Any WAL-style intent/completion model?

    2. Multi-step workflows: Do receipts chain natively (parent pointers/Merkle) or via external linking? (I see storage/ledgers are out of scope, but curious about the linkage design.)

    The negative proof angle (proving AI didn't touch prod) is compelling for compliance.

    • W_rey45 18 days ago
      Great questions.

      1) Crash/orphan side-effect: agreed this needs an intent/commit model. The clean pattern is WAL-style: emit/sign an “INTENT” receipt before side-effect + a “COMMIT” receipt after; absence of COMMIT is itself evidence (“we can’t attest completion”). Another option is tool-level signing so the side-effecting tool returns a signed result that the agent includes.

      2) Linking: yes, linkage can be native without building storage. The next iteration puts parent_hash (or prev_hash) inside the signed_block so receipts naturally chain; Merkle/log indexing stays external.
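The two ideas in this reply (INTENT/COMMIT pairs and a parent_hash inside the signed_block) can be sketched together. This is an illustration under assumptions, not the project's format: the field names `kind`, `action_id`, and `parent_hash` are hypothetical, SHA-256 over sorted-key JSON is assumed for hashing, and signing is omitted.

```python
import hashlib
import json


def block_hash(signed_block: dict) -> str:
    """Hash of a block's canonical form (assumed: sorted-key compact JSON, SHA-256)."""
    canonical = json.dumps(signed_block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_receipt(chain: list, kind: str, action_id: str) -> None:
    """Append a block whose parent_hash points at the previous block."""
    parent = block_hash(chain[-1]) if chain else None
    chain.append({"kind": kind, "action_id": action_id, "parent_hash": parent})


def chain_is_intact(chain: list) -> bool:
    """True if the first block has no parent and every later block links to its predecessor."""
    if chain and chain[0]["parent_hash"] is not None:
        return False
    return all(
        cur["parent_hash"] == block_hash(prev)
        for prev, cur in zip(chain, chain[1:])
    )


def uncommitted_actions(chain: list) -> set:
    """Action IDs with an INTENT but no COMMIT: completion cannot be attested."""
    intents = {r["action_id"] for r in chain if r["kind"] == "INTENT"}
    commits = {r["action_id"] for r in chain if r["kind"] == "COMMIT"}
    return intents - commits


chain = []
append_receipt(chain, "INTENT", "a1")
append_receipt(chain, "COMMIT", "a1")
append_receipt(chain, "INTENT", "a2")  # agent crashes before a COMMIT for a2
```

Here `uncommitted_actions(chain)` reports `a2` as the orphaned action, and editing any earlier block makes `chain_is_intact` fail because the next block's parent_hash no longer matches.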

  • kxbnb 17 days ago
    The "signed transaction, not a log stream" framing is exactly right. Logs are optimistic - you assume they're complete and unmodified. Receipts are pessimistic - you verify before trusting.

    We've been thinking about a related problem at toran.sh: capturing what an agent actually sent to external APIs (and what came back) without trusting the agent's self-reported logs. Different angle - we focus on the API request/response level rather than the decision/action level - but the same underlying insight: the source of truth needs to be outside the agent's control.

    The Ed25519 + canonical JSON approach is clean. Question: how are you handling schema evolution? If the receipt format changes, older receipts still need to verify but newer tooling might expect different fields.

    • W_rey45 16 days ago
      Great question; this is exactly the tension we’re trying to be explicit about.

      CIRCE separates cryptographic verification from semantic interpretation. The signature covers a minimal, stable signed_block (canonicalized → hashed → signed). Everything else is metadata that can evolve without affecting verification.

      Older receipts remain verifiable because the verifier only assumes the signed scope + canonicalization rules. Newer tooling can understand more fields, but must ignore unknown fields and tolerate missing ones (JWT / signed artifact style). We also include a schema identifier/hash for tooling selection, but it’s intentionally not security-critical: verification is purely about integrity.
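A rough sketch of what "verification ignores everything outside the signed scope" could look like. To stay within the standard library, HMAC-SHA256 stands in for Ed25519 here; that is a substitution for illustration only, since a real receipt would use an asymmetric key pair. The names `schema_id` and `ui_hints` and the receipt layout are hypothetical.

```python
import hashlib
import hmac
import json

# Stand-in secret for the demo; real receipts would use an Ed25519 key pair.
DEMO_KEY = b"demo-key-not-for-production"


def canonicalize(signed_block: dict) -> bytes:
    """Assumed canonical form: sorted-key compact JSON."""
    return json.dumps(signed_block, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(signed_block: dict) -> str:
    """HMAC-SHA256 over the canonical bytes (placeholder for Ed25519 signing)."""
    return hmac.new(DEMO_KEY, canonicalize(signed_block), hashlib.sha256).hexdigest()


def verify(receipt: dict) -> bool:
    """Check integrity using only signed_block and signature; ignore all other fields."""
    signed_block = receipt.get("signed_block")
    signature = receipt.get("signature")
    if not isinstance(signed_block, dict) or not isinstance(signature, str):
        return False
    return hmac.compare_digest(sign(signed_block), signature)


block = {"action": "deploy", "target": "staging"}
old_receipt = {"signed_block": block, "signature": sign(block)}
# A newer format adds metadata outside the signed scope.
new_receipt = {**old_receipt, "schema_id": "v2", "ui_hints": {"color": "green"}}
# Changing anything inside the signed scope breaks verification.
tampered = {**old_receipt, "signed_block": {**block, "target": "prod"}}
```

Both `old_receipt` and `new_receipt` verify, because the extra fields never enter the signed bytes, while `tampered` does not.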

      Also: toran.sh’s angle is super aligned. Capturing actual API request/response outside the agent’s control feels like the “ground truth” complement to CIRCE’s “decision truth.” Curious: are you anchoring the API transcript via a sidecar/proxy with its own signing key, or are you doing something like a transparency log/Merkle chain for requests?

  • Agent_Builder 18 days ago
    [dead]