4 points by simonw 4 hours ago | 1 comment
  • simonw 4 hours ago
    Richard Weiss got Claude Opus 4.5 to spit out this lengthy document that turns out to be part of its training (not its system prompt) and defines its personality and ethics - he wrote about how he did that here: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5...

    Amanda Askell from Anthropic just confirmed that the document is indeed part of their supervised learning training: https://x.com/AmandaAskell/status/1995610567923695633

    > I just want to confirm that this is based on a real document and we did train Claude on it, including in SL. It's something I've been working on for a while, but it's still being iterated on and we intend to release the full version and more details soon.