2 pointsby paolovella19 hours ago1 comment
  • bettaher_adam9 hours ago
    The fail-closed approach is the right default. One thing I'd add to the attack classes you're considering: prompt injection via filesystem reads — an attacker can craft a file that, when read by the agent, injects instructions into the tool-call chain.

    We solved a similar boundary problem by signing all outputs with HMAC-SHA256 so downstream consumers can verify the response wasn't modified after the tool-call boundary. Not a replacement for your approach but complementary — input validation + output signing covers both ends.

    Is the MCPSEC benchmark public yet?