2 points by santthosh015 hours ago | 1 comment
  • santthosh015 hours ago
    This week's theme across 6 papers: making AI agents less brittle.

    Highlights:

    * Memory-augmented RL agents recall past attempts and learn from them at the same time → 2x improvement on complex tasks, with no retraining needed for new problems

    * Adding a code execution verification step before self-improvement training fixes the "models agreeing on wrong answers" problem

    * A new training method cuts harmful agentic behavior by 50% while keeping task completion intact

    * Meta iterated on their social chatbot 15 times with real Instagram/WhatsApp users, reaching 19% more conversation depth

    Full summaries with "what it is / why it matters" breakdowns at the link.