1 point by dreynow 5 hours ago | 1 comment
  • dreynow 5 hours ago
    I've been building multi-agent systems and kept running into the same problem: every framework trusts agents by default. There's no identity verification, no way to learn which agent performs well on which task, and no mechanism to revoke permissions when something underperforms. You only find out something went wrong after reading traces.

    I built Agent Trust to explore a different approach: treat agents like services with identity, permissions, and track records. It's a Python SDK (MIT, 135 tests) that adds several layers:

    Identity – Each agent gets an Ed25519 keypair and a DID (did:agent:). Every action is cryptographically signed.

    Delegation – Scoped permissions with caveats and expiry. An agent can only do what it's been explicitly allowed to do.

    Reputation – Computed from verified outcomes, not self-reported metrics.

    Routing – UCB-based selection picks agents based on past performance, balancing exploration and exploitation.

    Enforcement – Permissions can be restricted or revoked at runtime. Cryptographic, not advisory.

    pip install kanoniv-trust

    from agent_trust import TrustAgent

    trust = TrustAgent() # SQLite, zero setup

    trust.register("researcher", capabilities=["search", "analyze"])
    trust.register("writer", capabilities=["draft", "edit"])

    trust.delegate("researcher", scopes=["search"], expires_in=3600)
    trust.delegate("writer", scopes=["draft", "edit"])

    trust.observe("researcher", action="search", result="success", reward=0.9)
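For intuition, a reputation score built from verified outcomes can be as simple as aggregating observed rewards per (agent, action) pair. This is my own minimal sketch, not the package's actual scoring formula:

```python
from collections import defaultdict

# outcomes[(agent, action)] holds verified rewards, appended as they arrive
outcomes = defaultdict(list)


def observe(agent: str, action: str, reward: float) -> None:
    """Record a verified outcome for an (agent, action) pair."""
    outcomes[(agent, action)].append(reward)


def reputation(agent: str, action: str) -> float:
    """Mean observed reward; 0.5 (a neutral prior) when nothing is known."""
    rewards = outcomes[(agent, action)]
    return sum(rewards) / len(rewards) if rewards else 0.5


observe("researcher", "search", 0.9)
observe("researcher", "search", 1.0)
print(reputation("researcher", "search"))  # 0.95
print(reputation("writer", "draft"))       # 0.5, no observations yet
```

The key property is that the score comes only from recorded outcomes, so an agent can't inflate it by self-reporting.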

    trust.authorized("researcher", "search")   # True
    trust.authorized("researcher", "analyze")  # False, not delegated

    best = trust.select(["researcher", "writer"], action="search")
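For readers unfamiliar with UCB, here is a stdlib-only UCB1 sketch of the general technique (an illustration of the standard algorithm, not Agent Trust's actual code): each candidate's mean reward gets an exploration bonus that shrinks as it accumulates observations.

```python
import math


def ucb1_select(stats: dict) -> str:
    """Pick the agent maximizing mean reward + sqrt(2 ln N / n).

    stats maps agent -> (total_reward, pulls). Agents with zero pulls
    win immediately (infinite bonus), which forces initial exploration.
    """
    total_pulls = sum(n for _, n in stats.values())

    def score(agent: str) -> float:
        reward, n = stats[agent]
        if n == 0:
            return float("inf")
        return reward / n + math.sqrt(2 * math.log(total_pulls) / n)

    return max(stats, key=score)


stats = {"researcher": (9.5, 10), "writer": (0.0, 0)}
print(ucb1_select(stats))  # "writer": never tried, so it gets explored first
```

Once every agent has observations, the bonus term keeps occasionally re-testing weaker agents instead of locking onto an early leader.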

    The part I've found most useful: agents can read their own verified track record before acting.

    ctx = trust.recall("researcher")
    print(ctx.guidance)
    # "researcher excels at search (95% success). Weaknesses: none observed.
    #  Recommendation: High confidence for search tasks."

    This injects a summary of past outcomes — success rate, strengths, weaknesses — directly into the prompt. Agents adapt based on actual performance without retraining. It's a simple form of in-context reinforcement learning, and it's the thing that surprised me most: agents genuinely behave differently when they can see their own track record.
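To make the mechanism concrete, here is roughly what injecting a track record into a prompt might look like. The template and function are my own assumptions for illustration, not the SDK's actual format:

```python
def build_prompt(task: str, guidance: str) -> str:
    """Prepend a verified track-record summary to the task prompt.

    The model sees its own history before acting, so it can lean into
    strengths or flag weak spots, with no retraining involved.
    """
    return (
        "Your verified track record:\n"
        f"{guidance}\n\n"
        f"Task: {task}"
    )


guidance = ("researcher excels at search (95% success). "
            "Weaknesses: none observed.")
prompt = build_prompt("Find recent papers on agent security.", guidance)
print(prompt.splitlines()[0])  # "Your verified track record:"
```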

    There's also a dashboard (Observatory) for visualizing reputation scores and delegation graphs, and integrations for LangChain and CrewAI.

    I looked at Langfuse, AgentOps, and similar tools — they're good at tracing but stop at observation. This tries to close the loop: identity, verified history, and decision-making in one system.

    It's early but usable today. I'd especially appreciate feedback from people running multi-agent systems in production.

    GitHub: https://github.com/kanoniv/agent-trust