For example, if I write a bad AGENTS.md for a repo with 100 engineers actively working in it, then every agent for every engineer gets worse, without anyone really noticing.
I think we should move toward data-driven tuning of AGENTS.md: test a change, gather data, and then decide whether or not to ship it.
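A minimal sketch of what that tuning loop could look like, assuming you have some way to run an agent against a task suite and score pass/fail (the `run_task` simulation below is a hypothetical stand-in for that, not a real API):

```python
import random
import statistics

def run_task(task: str, agents_md: str) -> bool:
    # Hypothetical stand-in: in practice this would launch the agent with
    # the given AGENTS.md and score the result. Here we just simulate a
    # deterministic pass/fail outcome so the harness shape is visible.
    rng = random.Random(sum(map(ord, task + agents_md)))
    return rng.random() < (0.7 if "concise" in agents_md else 0.6)

def pass_rate(tasks: list[str], agents_md: str, trials: int = 20) -> float:
    # Run every task several times under the given AGENTS.md variant
    # and return the overall pass rate.
    results = [run_task(f"{t}#{i}", agents_md) for t in tasks for i in range(trials)]
    return statistics.mean(results)

tasks = ["fix-bug-123", "add-endpoint", "refactor-auth"]
baseline = pass_rate(tasks, "verbose instructions ...")
candidate = pass_rate(tasks, "concise instructions ...")
print(f"baseline={baseline:.2f} candidate={candidate:.2f}")
# Ship the candidate only if the improvement is clearly outside noise.
```

The point is the shape, not the scoring: fix a task suite, hold everything constant except the AGENTS.md variant, and compare pass rates before shipping.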
- Keep it concise; use progressive disclosure / nested AGENTS.md files to expand information on demand
- Give the agent the high-level repo structure if necessary
- Have a "why" section to align the agent, at a high level, on what your code is doing
- Keep behavior instructions positive where possible, e.g. "Always clarify intent before acting"
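A minimal skeleton following these points might look like the sketch below; the section names, paths, and wording are illustrative choices, not a standard:

```markdown
# AGENTS.md

## Why
This service handles billing for the storefront; correctness of money
math matters more than latency. Prefer clarity over cleverness.

## Repo structure
- api/   — HTTP handlers
- core/  — billing logic (start here for most changes)
- infra/ — deploy and CI config; see infra/AGENTS.md for details

## Behavior
- Always clarify intent before acting on ambiguous requests.
- Always run the test suite before proposing a change.
```

Note the nested `infra/AGENTS.md` pointer for progressive disclosure, and that both behavior rules are phrased as positive "always" instructions rather than prohibitions.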
Anthropic has a new post series on enterprise adoption; the first one covers setup, and AGENTS.md gets a good chunk of it.
I now have agents write more of that stuff but review it deeply. As a peer commenter points out, a bad instruction can do damage. Keep them lean and clean, and adjust them as new models arrive.