4 points by wenhan_zhou 4 hours ago | 2 comments

wgd 4 hours ago
I've always been amazed at how terrible most frontier LLMs are at compaction given how embarrassingly easy it is to come up with half a dozen different RL training evals which would teach models to generate useful context summaries. Heck, you could bolt it onto any existing RL eval by just forcing a compaction every three turns.
- wenhan_zhou an hour ago
  Yep. Or even better, compact after a random number of turns. The model must then learn to preserve useful context at arbitrary context lengths.
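A rough sketch of what the two comments above describe, assuming a generic multi-turn episode loop. The names `run_episode`, `respond`, and `compact` are hypothetical and not taken from any real RL framework. `respond` stands in for the policy and `compact` for the model's own summarization call. The existing task reward would be applied to the replies unchanged, so the model only scores well if its summaries kept what later turns needed.

```python
import random
from typing import Callable, List


def run_episode(
    turns: List[str],
    respond: Callable[[List[str], str], str],
    compact: Callable[[List[str]], str],
    rng: random.Random,
    min_gap: int = 1,
    max_gap: int = 5,
) -> List[str]:
    """Run one episode, replacing the context with a summary at random intervals.

    Hypothetical harness: `respond` and `compact` are placeholders for model calls.
    """
    context: List[str] = []
    replies: List[str] = []
    # Number of turns remaining until the next forced compaction.
    turns_left = rng.randint(min_gap, max_gap)

    for user_msg in turns:
        reply = respond(context, user_msg)
        context += [user_msg, reply]
        replies.append(reply)

        turns_left -= 1
        if turns_left == 0:
            # Everything the model may keep must fit in this one summary.
            context = [compact(context)]
            turns_left = rng.randint(min_gap, max_gap)

    return replies
```

Setting `min_gap == max_gap == 3` gives the fixed "every three turns" variant. A wider range gives the random variant, where the model cannot predict when its context will be cut.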
wenhan_zhou 4 hours ago
If understanding emerges from pre-training, then perhaps memory is what emerges from post-training.