{
"sandbox": {
"enabled": true,
"filesystem": {
"allowRead": ["."],
"denyRead": ["~/"],
"allowWrite": ["."],
"denyWrite": ["/"]
}
}
}
You can change the read part if you're ok with it reading outside. This feature was only added 10 days ago fwiw but it's great and pretty much this.It works well. Git rm is still allowed.
Nowadays I only run Claude in Plan mode, so it doesn’t ask me for permissions any more.
Escaping it is something that does not take too much effort. If you have ptrace, you can escape without privileges.
People might genuinely want some other software to do the sandboxing. Something other than the fox.
Docker sandboxes use microvms (i.e. hardware level isolation)
Bubblewrap uses the same technology as containers
I am unsure about seatbelt.
For example:
Bash(swift build 2>&1 | tail -20)
⎿ warning:
/Users/enduser/Library/org.swift.swiftpm/configuration is not accessible or not writable, disabling user-level cache
features. warning: /Users/enduser/Library/org.swift.swiftpm/security is not accessible or not writable, disabling user-level cache feat
… +26 lines (ctrl+o to expand)
Build hit sandbox restriction. Retrying outside sandbox.Bash(swift build 2>&1 | tail -20)
⎿ [35/52] Compiling MCP Resources.swift
[36/52] Emitting module MCP
[37/52] Compiling MCP Client.swift
… +17 lines (ctrl+o to expand)
⎿ (timeout 3m) Configure Overrides:
1. Allow unsandboxed fallback
2. Strict sandbox mode (current)
Allow unsandboxed fallback: When a command fails due to sandbox restrictions, Claude can retry with dangerouslyDisableSandbox to run outside the sandbox (falling back to
default permissions).
Strict sandbox mode: All bash commands invoked by the model must run in the sandbox unless they are explicitly listed in excludedCommands.https://code.claude.com/docs/en/sandboxing
(I have no idea why that isn't the default because otherwise the sandbox is nearly pointless and gives a false sense of security. In any case, I prefer to start Claude in a sandbox already than trust its implementation.)
[0] Fun to see the confusing docs I wrote show up more or less verbatim on Claude’s docs.
[1] My config is here, may be useful to someone: https://github.com/carderne/pi-sandbox/blob/main/sandbox.jso...
e.g. if it writes a script or program with a bug which affects other files, will this prevent it from deleting or overwriting them?
What about if the user runs a program the agent wrote?
The latter could end like this https://news.ycombinator.com/item?id=47357042
We've been securing our systems in all ways possible for decades and then one day just said: oh hello unpredictable, unreliable, Turing-complete software that can exfiltrate and corrupt data in infinite unknown ways -- here's the keys, go wild.
And nothing big has happened despite all the risks and problems that came up with it. People keep chasing speed and convenience, because most things don’t even last long enough to ever see a problem.
Industry caught on quick though.
/rant
It looks both more convenient and slightly more secure than my solution, which is that I just give them a separate user.
Agents can nuke the "agent" homedir but cannot read or write mine.
I did put my own user in the agent group, so that I can read and write the agent homedir.
It's a little fiddly though (sometimes the wrong permissions get set, so I have a script that fixes it), and keeping track of which user a terminal is running as is a bit annoying and error prone.
---
But the best solution I found is "just give it a laptop." Completely forget OS and software solutions, and just get a separate machine!
That's more convenient than switching users, and also "physically on another machine" is hard to beat in terms of security :)
It's analogous to the mac mini thing, except that old ThinkPads are pretty cheap. (I got this one for $50!)
Also. Agents are very good at hacking “security penetration testing”, so “separate user” would not give me enough confidence against malicious context.
> jai itself was hand implemented by a Stanford computer science professor with decades of C++ and Unix/linux experience. (https://jai.scs.stanford.edu/faq.html#was-jai-written-by-an-...)
The web site is... let's say not in a million years what I would have imagined for a little CLI sandboxing tool. I literally laughed out loud when claude pooped it out, but decided to keep, in part ironically but also since I don't know how to design a landing page myself. I should say that I edited content on the docs part of the web site to remove any inaccuracies, so the content should be valid.
Kinda reminds me of this: https://m.xkcd.com/932/
I'm not a web UI guy either, and I am so, so happy to let an AI create a nice looking one for me. I did so just today, and man it was fast and good. I'll check it for accuracy someday...
You need to rewrite all the text and Telde it with text YOU would actually write, since I doubt you would write in that style.
To your actual point, the people that would take the landing page being written by an LLM negatively tend to be able to evaluate the project on its true merits, while another substantial portion of the demographic for this tool would actually take that (unfortunately, imo) as a positive signal.
Lastly, given the care taken for the docs, it’s pretty likely that any real issues with the language have been caught and changed.
No they don't. The text is very clearly conveying what this project is about. Not everyone needs to cater to weirdos who are obsessed with policing how other people use LLM.
Author would, indeed, be wise to rewrite all the text appearing on the front page with text that he wrote himself.
the scs.stanford.edu domain and stanford-scs github should help with that.
David has done some great work and some funny work. Sometimes both.
So couldn't this be done with an appropriate shell alias - at least under linux.
I like the tradeoff offered: full access to the current directory, read-only access to the rest, copy-on-write for the home directory. With stricter modes to (presumably) protect against data exfiltration too. It really feels like it should be the default for agent systems.
Run <ai tool of your choice> under its own user account via ssh. Bind mount project directories into its home directory when you want it to be able to read them. Mount command looks like
sudo mkdir /home/<ai-user>/<dir-name>
sudo mount --bind <dir to mount> --map-groups $(id -g <user>):$(id -g <ai-user>):1 --map-users $(id -u <user>):$(id -u <ai-user>):1 /home/<ai-user>/<dir-name>
I particularly use this with vscode's ssh remotes.I've been using claude code daily for months and the worst thing that happened wasnt a wipe(yet). It needed to save an svg file so it created a /public/blog/ folder. Which meant Apache started serving that real directory instead of routing /blog. My blog just 404'd and I spent like an hour debugging before I figured it out. Nothing got deleted and it's not a permission problem, the agent just put a file in a place that made sense to it.
jai would help with the rm -rf cases for sure but this kind of thing is harder to catch because its not a permissions problem, the agent just doesn't know what a web server is.
I had originally thought this would ok as we could review everything in the git diff. But, it later occurred to me that there are all kinds of files that the agent could write to that I'd end up executing, as the developer, outside the sandbox. Every .pyc file for instance, files in .venv , .git hook files.
ChatGPT[1] confirms the underlying exploit vectors and also that there isn't much discussion of them in the context of agent sandboxing tools.
My conclusion from that is the only truly safe sandboxing technique would be one that transfers files from the sandbox to the dev's machine through some kind of git patch or similar. I.e. the file can only transfer if it's in version control and, therefore presumably, has been reviewed by the dev before transfer outside the sandbox.
I'd really like to see people talking more about this. The solution isn't that hard, keep CWD as an overlay and transfer in-container modified files through a proxy of some kind that filters out any file not in git and maybe some that are but are known to be potentially dangerous (bin files). Obviously, there would need to be some kind of configuration option here.
1: https://chatgpt.com/share/69c3ec10-0e40-832a-b905-31736d8a34...
You can already make CWD an overlay with "jai -D". The tricky part is how to merge the changes back into your main working directory.
I don't think the file sync is actually that hard. Famous last words though. :)
I want AI to have full and unrestricted access to the OS. I don't want to babysit it and approve every command. Everything that is on that VM is a fair game and the VM image is backed up regularly from outside.
This is the only way.
Please release binaries if you're making a utility :(
It does something very simple, and it’s a POSIX shell script. Works on Linux and macOS. Uses docker to sandbox using bind mount
I created https://github.com/jrz/container-shell which basically launches a persistent interactive shell using docker, chrooted to the CWD
CWD is bind mounted so the rest is simply not visible and you can still install anything you want.
Ignoring the confidentiality arguments posed here, I can’t help to think about snapshotting filesystems in this context. Wouldn’t something like ZFS be an obvious solution to an agent deleting or wildly changing files? That wouldn’t protect against all issue the authors are trying to address, but it seems like an easy safeguard against some of the problems people face with agents.
I wonder if shitty looking websites and unambitious grammar will become how we prove we are human soon.
More seriously, I'm not a heavy agent user, but I just create a user account for the agent with none of my own files or ssh keys or anything like that. Hopefully that's safe enough? I guess the risk is that it figures out a local privilege escalation exploit...
P.S. Everything old is new again <3
For devops work, etc (like your use case), I much prefer talking to it and letting it guide me into fixing the issue. Mostly because after that I really understand what the issue was and can fix it myself in the future.
You simply tell it to install that Docker image on your NAS like normal, but when it needs to login to SSH it prompts for fingerprint. The agent never gets access to your SSH key.
Not remotely worth it.
Still if you yolo online access and give it cred or access to tools that are authenticated there can still be dragons.
It's like people think that because containers and VMs exist, they are probably going to be using them when a problem happens. But then you are working in your own home directory, you get some compiler error or something that looks like a pain to decipher, and the urge just to fire up claude or codex right then and there to get a quick answer is overwhelming. Empirically, very few people fire up the container at that point, whereas "jai claude" or "jai -D claude" is simple enough to type, and basically works as well as plain claude so you don't have to think about it.
Use it! :) https://code.claude.com/docs/en/sandboxing
> bubblewrap is more flexible and works without root. jai is more opinionated and requires far less ceremony for the common case. The 15-flag bwrap invocation that turns into a wrapper script is exactly the friction jai is designed to remove.
Plus some other comparisons, check the page
With all the supply chain issues these days onboarding new tools carries extra risks. So, question is if it's worth it.
You have no excuse for "it deleted 15 years of photos, gone, forever."
When is HN gonna get a rule against AI/generated slop? Can’t come soon enough.
The name jai is very taken[1]... names matter.
[1]: https://en.wikipedia.org/wiki/Jai_(programming_language)