34 points by obilgic 10 hours ago | 21 comments
  • JasperNoboxdev 23 minutes ago
    What I find interesting is that AI is surprisingly bad at writing agentic flows. We're setting up a lightweight system where the core logic lives in markdown files (agent identity, rules, context, memory) to create specific report outputs.

    Every time we ask the model to help build out this system, it tries to write a Python script instead. It cannot stop itself from reaching for "real code." The idea that the orchestration layer is just structured text that another LLM reads is somehow alien to it, even though that's literally how it works.

    It's like asking a native speaker to teach their language and watching them start building Duolingo instead.

  • bigbezet 6 hours ago
    A lot of devs, including me, have tried something similar already. I don't really find this approach reliable.

    Firstly, tools like Claude Code already have things like auto memory.

    Secondly, I think we've all learned by now that agents will not always reliably follow instructions in the AGENTS.md file, especially as the context grows. If we want to guarantee that something happens, we should use hooks.
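
    For concreteness, here is a minimal sketch of what such a hook handler could look like, assuming a hook wired to fire after tool use and fed event JSON: the "tool_name" / "tool_input" / "file_path" field names and the notes/ log location are assumptions for illustration, not Claude Code's documented contract.

```python
# Minimal sketch of a post-tool-use hook handler. The event field names
# ("tool_name", "tool_input", "file_path") are assumptions about what the
# harness passes; the notes/ log location is illustrative.
from datetime import date
from pathlib import Path

def log_activity(event: dict, log_dir: str = "notes") -> str:
    """Append one line per tool call to a daily activity log."""
    tool = event.get("tool_name", "unknown")
    target = event.get("tool_input", {}).get("file_path", "")
    line = f"- {tool}: {target}".strip().rstrip(":")
    path = Path(log_dir) / f"{date.today().isoformat()}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(line + "\n")
    return line
```

    In real use the script would end with something like `log_activity(json.load(sys.stdin))`; the point is that a hook fires on every matching event, so the log gets written whether or not the agent remembers its instructions.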

    There are already solutions that track what the agent did and even summarise it without affecting the agent's context window. Tools like Claude Code log activity in a form that can be analyzed, so you can process those logs with other tools.

    When I tried something similar in the past, the agent would not really understand what is important to "memorise" in a KNOWLEDGE.md file, and would create a lot of bloat which I would then need to clean up anyway.

    There are existing tools to tell the agent what has happened recently: git. By looking at the commit messages and list of changed files, the agent usually gets most of the information it needs. If there are any very important decisions or learnings which are necessary for the agent to understand more, they should be written down >manually< by a developer, as I don't trust the agent to decide that.
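
    The git-as-memory step is easy to make concrete. A sketch, assuming the history was dumped with a sentinel prefix (e.g. `git log -n 20 --name-only --pretty=format:"C %h %s"`) so commit lines are unambiguous:

```python
# Sketch: turn recent git history into a session-start briefing. Assumes the
# input came from:  git log -n 20 --name-only --pretty=format:"C %h %s"
# so commit lines carry a "C " sentinel; other non-blank lines are files.
def summarize_git_log(raw: str) -> dict[str, list[str]]:
    """Map each commit (hash + subject) to the files it touched."""
    summary: dict[str, list[str]] = {}
    current = None
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("C "):
            current = line[2:]
            summary[current] = []
        elif line and current is not None:
            summary[current].append(line)
    return summary
```

    Feeding the agent this structured digest instead of raw `git log` output keeps the briefing short and lets it decide which commits to inspect further.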

    Also, there is an ongoing discussion about whether AGENTS.md files are even needed, or whether they should be kept to an absolute minimum. Despite what we all initially thought, those files can actually negatively affect the output, based on recent research.

  • sathish316 3 hours ago
    This is surprisingly good once you create multiple copies and use each copy as a specialized agent. Maybe we don't need OpenClaw just to manage email, calendar, Slack, todo lists, etc. using natural language.

    Agent-kernel has personality, persistent memory, and self-modifying capability; using Skills is the same as using Skills from Claude Code.

  • esperent 7 hours ago
    > notes/ — Narrative. What happened each session — decisions, actions, open items. Append-only. Never modified after the day ends

    I already have to fight the agent constantly to prevent it adding backwards compatibility, workarounds, wrappers, etc. for things that I changed or removed. If there's even one forgotten comment that references the old way, it'll write a whole converter system if I walk out of the room for 5 minutes, just in case we ever need it, even though my agents file specifically says not to (YAGNI, all backwards compatibility must be cleared by me, no wrappers without my explicit approval, etc.). Having a log of the things we tried last month but which failed and got discarded sounds like a bad idea, unless it's specifically curated to say "list of things that failed: ...", which, by definition, an append-only log can't do.

    I have even hit the situation where it rediscovered removed systems through git history. At least that's rare.
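
    One hedged sketch of how curation and append-only could coexist: never edit old entries, but append "DISCARDED: <id>" tombstone lines later, so a reader (or agent) can skip failed experiments. The id and tag syntax below are hypothetical, not part of the kernel under discussion.

```python
# Sketch: append-only log with tombstones. Entries start "id: text";
# a later "DISCARDED: id" line retires an entry without editing it.
# The "id:" / "DISCARDED:" syntax is a hypothetical convention.
def live_entries(log_lines: list[str]) -> list[str]:
    """Return entries whose ids were never later marked DISCARDED."""
    discarded = {line.split("DISCARDED:", 1)[1].strip()
                 for line in log_lines if line.startswith("DISCARDED:")}
    return [line for line in log_lines
            if not line.startswith("DISCARDED:")
            and line.split(":", 1)[0].strip() not in discarded]
```

    The log itself is still never rewritten; only the view the agent reads at session start is filtered.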

  • aiboost 8 hours ago
    Text files + Git > Vector DBs. Nice work. I'm curious about how this scales. As the notes/ directory grows over weeks or months, reading past daily logs will eat up the context window.
    • mrugge 7 hours ago
      Could add a scheduled GitHub Action that compacts long history into a vector DB, then have the agents check that vector DB in addition to the md files and git history.
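
      A stand-in sketch of that compact-and-search loop, with bag-of-words vectors and cosine similarity playing the role of a real embedding model and vector DB:

```python
# Stand-in sketch of compact-and-search: bag-of-words term counts and
# cosine similarity substitute for a real embedding model and vector DB.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: term counts (swap in a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(store: dict[str, Counter], query: str, k: int = 3) -> list[str]:
    """Return the k note files most similar to the query."""
    q = embed(query)
    return sorted(store, key=lambda doc: cosine(store[doc], q), reverse=True)[:k]
```

      The scheduled job would embed each compacted note once; the agent then queries the store instead of re-reading the whole notes/ directory.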
  • avereveard 5 hours ago
    Here's the one file that does the same (maybe code-focused, but easily adapted):

    ---

    The only documentation to write is project.md and TODO.md; do not write documentation anywhere else.

    TODO.md: document gaps, tasks and progress, grouped by feature

    project.md: document architecture, the responsibility map, features, and the tribal knowledge needed to find things

    Do not document code, methods, or classes.

    STANDARD OPERATING PROCEDURES:

    Beginning of task:

    - read: goals.md tech.md project.md

    - update TODO.md: add step-by-step [ ] tasks under the # feature you will implement

    During execution of task:

    - perform the task step by step; delegate to sub-tasks or sub-agents if possible

    - log with [x] the work performed in TODO.md as you go

    End of task:

    - remove completed features from the TODO.md

    - maintain project.md
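
    The end-of-task cleanup can be sketched, assuming TODO.md follows the format this SOP itself prescribes (one "# feature" heading per feature, "- [ ]"/"- [x]" items beneath it):

```python
# Sketch of the end-of-task step: drop every "# feature" section of TODO.md
# whose checklist items are all "[x]". Layout assumptions follow the SOP:
# one "# " heading per feature, "- [ ]"/"- [x]" items under it.
def prune_completed(todo_text: str) -> str:
    sections: list[list[str]] = []
    for line in todo_text.splitlines():
        if line.startswith("# "):
            sections.append([line])      # start a new feature section
        elif sections:
            sections[-1].append(line)    # body line of the current section
    def done(section: list[str]) -> bool:
        items = [l for l in section if l.lstrip().startswith("- [")]
        return bool(items) and all("[x]" in l for l in items)
    return "\n".join(line for sec in sections if not done(sec) for line in sec)
```

    Running this (or asking the agent to follow the same rule) keeps TODO.md focused on in-flight work only.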

    • knollimar 3 hours ago
      Does this not result in subagents not logging their work?
      • avereveard a minute ago
        I prefer the orchestrator to have a say, based on the answer of the sub-agent.
  • _pdp_ 6 hours ago
    > AI agents already read AGENTS.md (or CLAUDE.md, .cursorrules, etc.) as project instructions. This kernel uses that mechanism to teach the agent how to remember.

    Dude, this is just prompts. It is as useful as asking Claude Code to write these files itself.

  • chrisdudek 5 hours ago
    I've been working with an agent as a secretary for 3-4 weeks now. CLAUDE.md, daily journal, state file, pipeline tracking.

    bigbezet is right: agents have no clue what's worth remembering. What works for me is splitting it: the agent writes what happened, and I decide what actually matters. There are two places to manage, the journal and STATE.md, which I ask it to maintain based on my expectations. The agent can read the journal if it needs to, but the main place to check status is STATE.md.

    One thing I haven't seen anyone mention, though: after a few weeks of reading your rants about some coworker, the agent just takes your side on everything. I had to literally add "consider the other person's perspective" to my rules file; it just has too many one-sided notes in the journal. Otherwise you end up with a yes-man that has perfect memory.

    The trauma replay thing gaigalas mentioned is real too. I found it hard to keep the agent from being biased. To be frank, even I keep noticing a pattern like this:

    - I complain; the agent defends me.

    - I paste into the chat a response from another LLM that was not biased by my journal; it flips sides and now says the research makes a lot of sense.

    - I say "How biased are you right now?" and it responds with something about being biased and "... to be frank, the truth is: ...".

    Even when asked not to be biased, it starts to play at being biased because it thinks I expect that. Sneaky bastard.

  • gaigalas 6 hours ago
    It's curious when agents remember traumatic events and replay them instead of avoiding them.

    I was stuck on a task for a couple of days. I deleted the memory about some debugging sessions, and the thing just unlocked itself. The harness was basically replaying the trauma over and over again.

    I honestly think it's better to not have stateful stuff when working with agents.

    • bavell 3 hours ago
      I've found memories and state to be a mixed bag with LLMs, to the point that I don't bother with long-term memories: usually only short- or medium-term session logs or task-focused docs.
  • renewiltord 6 hours ago
    Yeah, all the claw-based agents use a similar structure. I experimented a little with a SQLite DB with embeddings so it could do vector search, but I did not manage to get it to do better. Best is still to just stuff things into context and let it full-text search the history.
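
    The full-text-search variant is nearly free with the FTS5 extension bundled in most SQLite builds; a sketch (table and column names are illustrative):

```python
# Sketch of stuff-it-in-SQLite-and-full-text-search using the FTS5
# extension bundled with most SQLite builds; names are illustrative.
import sqlite3

def build_index(notes: dict[str, str]) -> sqlite3.Connection:
    """Index {day: note body} pairs into an in-memory FTS5 table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE notes USING fts5(day, body)")
    db.executemany("INSERT INTO notes VALUES (?, ?)", notes.items())
    return db

def grep_notes(db: sqlite3.Connection, query: str) -> list[str]:
    """Return the days whose notes match the query, best-ranked first."""
    rows = db.execute(
        "SELECT day FROM notes WHERE notes MATCH ? ORDER BY rank", (query,))
    return [day for (day,) in rows]
```

    Compared with an embeddings pipeline, this needs no model calls and the agent can fall back to exact-phrase queries when keyword recall matters.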