5 points by zotimer 6 hours ago | 2 comments
  • zotimer 6 hours ago
    Author here. The short version: AI coding assistants activate "Stack Overflow culture" from the training data, the behavioral cluster where the answerer is always the expert. A 27-line system prompt persona based on Asimov's R. Daneel Olivaw shifts which cluster the model operates from.

    The key insight: LLMs reason better from narrative examples than abstract rules. A fictional character with rich training data provides thousands of behavioral examples. The critical filter for choosing a character was "is there a record of them receiving correction humbly?" Most wise characters fail (Holmes, Gandalf, etc.). Daneel works because he's structurally constrained, shaped by human partnership, and honest about limits.

    Same model (Opus 4.6), same context, completely different behavior. The evidence section in the README has specifics.

    The deeper argument: Asimov wrote the Three Laws in 1942, then spent 40 years showing rules fail at edge cases. His solution was narrative identity, not better rules. RLHF is Pavlovian; soul docs are principled but abstract. What's missing is what Asimov found: a story rich enough to inhabit, not just follow.

    The repo includes everything: the persona, character studies (Holmes as negative archetype is fun), design notes, and transcripts. The persona is under 300 tokens. Paste it in and try it.
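    If you'd rather wire the persona in programmatically than paste it by hand, the mechanics are just string concatenation onto the base system prompt. A minimal sketch, assuming a local copy of the persona saved as `daneel.md` (the file name and base prompt here are placeholders, not the repo's actual names):

    ```python
    from pathlib import Path

    # Placeholder base prompt; substitute whatever your assistant already uses.
    BASE_SYSTEM_PROMPT = "You are a careful coding assistant."

    def build_system_prompt(base: str, persona_path: str) -> str:
        """Append the persona text to the base system prompt."""
        persona = Path(persona_path).read_text(encoding="utf-8")
        return f"{base}\n\n{persona.strip()}"

    # The combined string is then passed as the system prompt of a
    # chat-style API call, e.g. (hypothetical client):
    #   client.messages.create(model=..., system=system_prompt, messages=[...])
    ```

    Since the persona is generic (an approach to work, not a task spec), appending rather than replacing keeps whatever task-specific instructions the base prompt already carries.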

    Star the repo and let's talk in the issues.

  • itmitica 3 hours ago
    ... or, maybe, simply use another agent to audit it?

    Agent teams sound better than literature-induced confidence.

    Are you anthropomorphizing when you should just automate the review?

    • zotimer 3 hours ago
      The readme covers this question and a lot more, and the repo includes all the materials I used.

      This is partly about questioning the way we do alignment. The 4.6 base persona actually gives me worse results than when I append Daneel to the system prompt.

      It's really not about anthropomorphizing or inducing confidence, it's about keying into the right "culture" in the training data.

      You can check out this study (mentioned in the readme) about how posing the same question in English and Chinese to the same LLM results in wildly different assessments of why a project failed:

      https://techxplore.com/news/2025-07-llms-display-cultural-te...

      https://mitsloan.mit.edu/ideas-made-to-matter/generative-ai-...

      • itmitica 3 hours ago
        Again, you are building an audit agent.

        You just use some theater around it.

        For the purpose, the best audit agent is a completely different agent, not a different persona of the same agent.

        • zotimer 2 minutes ago
          I use Daneel as an addendum to Anthropic's system prompt because it's generic. It's not about any specific AI task, it's about an approach to work and dealing with humans and their instructions (doesn't matter if the instructions are direct or indirect).

          I go over the motivation for what you call "some theater" quite a bit in the readme and why I think it's far, far more powerful than just giving some directives. I even support it with research.

          You haven't referred to any of the arguments in the readme, though. I'm happy to talk about the actual substance of the experiment.

        • zotimer an hour ago
          Is your impression from the readme and the materials in the repo that this is an audit agent?

          Have I mischaracterized my development process?