2 points by jaspervanveen 2 hours ago | 2 comments
  • jaspervanveen 2 hours ago
    Author here. A few things that didn't fit in the README:

    The immediate trigger was realizing that robots.txt, unchanged in spirit since 1994, is now being applied to systems that can book your flights and empty your bank account. The mismatch felt worth writing down.

    The parts I'm least certain about:

    1. The filename. agents.txt has prior art (see the spec's Related Prior Art section), and some of it solves a different problem. Maybe this should live at /.well-known/agents.txt instead -- thoughts?

    2. Agent identity. Right now it's self-declared strings, which is essentially an honour system. The spec flags this openly. Whether that's acceptable at v0.1 or a fatal flaw is a real question.

    3. Whether this needs to exist at all -- or whether MCP server discovery effectively solves it already. I don't think it does (MCP is implementation, this is policy), but I'd love to be argued out of that.
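    To make point 1 concrete, here's a purely illustrative sketch of what such a file might look like. The directive names below are my guesses for discussion, not the draft spec's actual syntax:

```
# /.well-known/agents.txt -- illustrative only; directive names are
# hypothetical, not taken from the draft spec
User-Agent: *
MCP-Server: https://example.com/mcp
Policy: https://example.com/agent-policy
Disallow-Action: purchase
```

    The robots.txt-style key/value shape is deliberate: the whole point is that a 20-line parser should be able to read it.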

    Draft spec in the repo if you want the full picture.

  • jjgreen 2 hours ago
    "Agents" ignore robots.txt; why do you think they'd care about this?
    • jaspervanveen 2 hours ago
      Fair pushback. A few angles:

      1. The premise is partly wrong. Major commercial agents -- the ones with legal teams and reputational risk -- already follow robots.txt. Google, Bing, all the big crawlers comply. The ones that don't were always going to ignore any standard. You design standards for good actors, not bad ones.

      2. agents.txt has a positive use case robots.txt doesn't. If a site declares "here's my MCP server, use that instead of scraping my HTML" -- a well-built agent actively wants to use it. It's faster, more reliable, and lower risk than parsing HTML. Compliance here is self-interested, not just ethical.
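      To illustrate why compliance is self-interested, here is a minimal sketch of the lookup a well-built agent might do before scraping. Everything here is hypothetical: the directive names ("User-Agent", "MCP-Server") and the robots.txt-style format are my assumptions, not the draft spec's syntax.

```python
def parse_agents_txt(text):
    """Parse robots.txt-style "Key: value" lines into per-agent policies.
    Hypothetical format: directive names are illustrative, not from the spec."""
    policies = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() == "user-agent":
            current = value
            policies.setdefault(current, {})
        elif current is not None:
            policies[current][key.lower()] = value
    return policies

def preferred_endpoint(policies, agent="*"):
    """Return the declared MCP server for this agent, if any."""
    for name in (agent, "*"):
        policy = policies.get(name)
        if policy and "mcp-server" in policy:
            return policy["mcp-server"]
    return None  # nothing declared: agent falls back to fetching HTML

sample = """\
User-Agent: *
MCP-Server: https://example.com/mcp
Allow: read
"""
print(preferred_endpoint(parse_agents_txt(sample)))
```

      The incentive is in the fallback branch: when a structured endpoint is declared, taking it is strictly cheaper for the agent than parsing HTML, so the compliant path is also the lazy path.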

      3. Non-compliance becomes legally actionable. Once there's a published standard, ignoring it strengthens ToS violation and CFAA claims. "You had a machine-readable policy and the agent ignored it" is a much cleaner legal argument than anything available today.

      4. The same argument was made about robots.txt in 1994. It's still here 30 years later because the ecosystem of good actors is larger than the ecosystem of bad ones -- and it keeps growing.

      The goal isn't to stop bad actors. It's to give good actors a common language.