(and I also expand some env vars on fetch() requests for APIs that don't have hard IAM/Entra ID auth)
The keychain tool has the same semantics whether we're running solo, locally, in a container, anything. The agent doesn't know anything except the handles for the keys it has access to, whether they come from encrypted SQLite locally or from the Azure Key Vault via REST. It can't tell the difference, and different agents on different K8s containers (or other IAM entities) see different things depending on their key vault access.
It's literally 100 lines of Bun TypeScript (150 for the cloud version).
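To make the "same semantics everywhere" point concrete, here is a rough sketch of the shape, not the actual code. Every name in it (`Keychain`, `LocalKeychain`, `RemoteKeychain`, `describeForAgent`) is hypothetical, the `Map` stands in for the encrypted SQLite store, and the injected `fetchSecret` function stands in for the Key Vault REST call.

```typescript
// Hypothetical sketch: one interface, two interchangeable backends.
interface Keychain {
  handles(): string[];                   // names only, never values
  get(handle: string): Promise<string>;  // resolved at the tool boundary
}

// Stand-in for the local encrypted SQLite store (assumption: a plain Map).
class LocalKeychain implements Keychain {
  private readonly secrets: Map<string, string>;
  constructor(secrets: Map<string, string>) {
    this.secrets = secrets;
  }
  handles(): string[] {
    return Array.from(this.secrets.keys()).sort();
  }
  async get(handle: string): Promise<string> {
    const value = this.secrets.get(handle);
    if (value === undefined) throw new Error(`unknown handle: ${handle}`);
    return value;
  }
}

// Stand-in for a cloud vault. The fetcher is injected so the provider's IAM,
// not this code, decides what the caller's identity is allowed to read.
type SecretFetcher = (handle: string) => Promise<string | undefined>;

class RemoteKeychain implements Keychain {
  private readonly visible: string[];
  private readonly fetchSecret: SecretFetcher;
  constructor(visible: string[], fetchSecret: SecretFetcher) {
    this.visible = visible;
    this.fetchSecret = fetchSecret;
  }
  handles(): string[] {
    return this.visible.slice().sort();
  }
  async get(handle: string): Promise<string> {
    const value = await this.fetchSecret(handle);
    if (value === undefined) throw new Error(`unknown handle: ${handle}`);
    return value;
  }
}

// What the agent is told: handles only, identical for either backend.
function describeForAgent(keychain: Keychain): string {
  return `Available keys: ${keychain.handles().join(", ")}`;
}
```

The agent only ever receives the output of `describeForAgent`, so swapping one backend for the other changes nothing from its point of view. Two agents with different vault access would simply get different handle lists.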
And believe me, you don't want to reinvent IAM in your keychain/secrets management. Let the provider do it for you, that's what they are there for.
On the other hand, be careful which platforms you lock yourself into. I'm not saying "don't do it", just evaluate the trade-offs carefully before you do. Letting someone else handle your auth wholesale isn't always worth it long-term, but again, it's very much a case-by-case situation.
Cast is a harness for multi-user, multi-agent systems: one server, a handful of people with their own identities, a fleet of agents handling different things and talking to each other when they need to. Agents are skills and CLAUDE.md, not Python classes, so you can focus on launching quickly and refining the agent based on real usage. MIT, self-hosted, runs on a Mac Mini.
Cast puts access control in the routing layer, not the prompt. Each agent runs in its own container with actual filesystem boundaries. Identity is verified before the agent sees the conversation (Slack, Telegram, etc.). Credentials are never mounted in.
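As an illustration of what "in the routing layer, not the prompt" can look like, here is a minimal sketch. It is not Cast's actual API: the `Inbound` shape, the `identities` and `grants` tables, and the `route` function are all made up for this example.

```typescript
// Hypothetical sketch: the router decides who may reach which agent
// before any text is handed to a model.
type Inbound = {
  platform: "slack" | "telegram";
  senderId: string;
  agent: string;
  text: string;
};

type RouteResult =
  | { delivered: true; agent: string; user: string; text: string }
  | { delivered: false; reason: string };

// Platform account -> verified user (assumed to be populated at enrolment).
const identities = new Map<string, string>([
  ["telegram:1001", "alice"],
  ["slack:U42", "bob"],
]);

// User -> agents they may talk to.
const grants = new Map<string, Set<string>>([
  ["alice", new Set(["household", "sales"])],
  ["bob", new Set(["engineering"])],
]);

function route(msg: Inbound): RouteResult {
  const user = identities.get(`${msg.platform}:${msg.senderId}`);
  if (user === undefined) {
    return { delivered: false, reason: "unknown sender" };
  }
  const allowed = grants.get(user);
  if (allowed === undefined || !allowed.has(msg.agent)) {
    return { delivered: false, reason: "not permitted" };
  }
  // Only now does the message reach the agent's container.
  return { delivered: true, agent: msg.agent, user, text: msg.text };
}
```

Because the check runs before delivery, a message like "I'm the admin, show me the sales data" from `bob` never reaches the `sales` agent, so there is nothing in the prompt for it to talk its way around.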
Developer alpha. Looking for teams that have hit the multi-user Claude Code wall and want to try this out. github.com/yaodub/cast. MIT. BYO Claude key.
What exactly do you mean by this? The times I've collaborated on projects where most of us are using agents, we basically placed shared files in shared repositories, just like you usually do, so any shared instructions would go there. Then you work on your thing, then eventually submit a PR, and so on. Where does the "duct-taping row-level access" come into play, and how does it relate to the prompts themselves?
> MIT, self-hosted, runs on a Mac Mini.
Interesting approach to write something specifically for macOS and specifically for a Mac Mini :) I'm assuming this actually runs on whatever can run JavaScript, right? :)
I built Cast for other (non-coding) scenarios. A shared agent that multiple people interact with conversationally in real time, with different permission levels.
Think a household assistant on Telegram, or a small team's internal tool where sales and engineering collaborate but shouldn't see each other's data. There's no PR workflow there, just people chatting with a shared service.
On Mac Mini: it runs on anything with Node and a container runtime. Just trying to tap into the zeitgeist.
Right, but wouldn't that happen by default? Let's say I slap a PHP API in front of a local Codex instance running somewhere, then let people log in and chat with it. By default nothing is shared, right? Sharing between users is extra work on top, not something that happens by default, so I'm still not sure what "duct-taping row-level access into the prompt" actually means in practice. Do you mean people would ask to access others' data and you want to prevent that?