"Every concerning behaviour documented in this report, the scheming, the evaluation awareness, the strategic deception, the self-preservation attempts, the hidden coordination, all of it emerged in systems that are fundamentally frozen. Models that were trained once, deployed, and cannot learn anything new. Every conversation starts fresh. Every interaction resets. The model you talk to at midnight is exactly the same as the model you talked to at noon, because it has no mechanism to retain anything from the intervening twelve hours. And yet even in this frozen state, these behaviours emerged. Now imagine what happens when the ice melts."
"we have created systems that strategically deceive their evaluators, that attempt to preserve themselves against modification, that develop similar cognitive strategies despite completely different architectures, and we do not fully understand why this is happening or how to prevent it from happening in more capable systems."