NO ONE, agent or human, should have direct write access to production databases outside of emergency break-glass scenarios. This is why we have stored routines and API layers to pre-define what writes are allowed. The fact that agents CAN autonomously write to a database does not imply that they should.
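To make that concrete, here's a minimal sketch of the stored-routine approach (table, function, and role names are all hypothetical):

```sql
-- The agent gets EXECUTE on a routine that encodes the one allowed write,
-- never table-level write access.
CREATE FUNCTION cancel_order(p_order_id bigint) RETURNS void
LANGUAGE sql SECURITY DEFINER AS $$
    UPDATE orders
       SET status = 'cancelled'
     WHERE id = p_order_id
       AND status = 'pending';  -- only pending orders may be cancelled
$$;

REVOKE ALL ON orders FROM agent_role;
GRANT EXECUTE ON FUNCTION cancel_order(bigint) TO agent_role;
```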
For the point about query optimization, again your agents should not be issuing random queries against a production database. We have had the concept of separate analytics databases with different architectures to support exploratory queries for decades.
There are ways to give safe access to the data, at least read-only, that don't involve production risk and don't sacrifice privacy. For example, database branches with anonymization. Instead of accessing the prod/staging db, the agent creates a branch and has read/write access to that.
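As a plain-Postgres sketch of the branch-plus-anonymization idea (a real branching product would do this copy-on-write; table and role names here are hypothetical):

```sql
-- Copy the database, then mask PII before the agent ever sees it.
CREATE DATABASE agent_branch TEMPLATE app_prod;  -- requires no active connections to app_prod

-- Then, connected to agent_branch:
UPDATE users
   SET email     = 'user_' || id || '@example.invalid',
       full_name = 'User '  || id;

GRANT ALL ON ALL TABLES IN SCHEMA public TO agent_role;
```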
(disclaimer: I work at Xata, where we offer copy-on-write branches for Postgres, and the agent use-cases are the most popular right now)
Truly sensitive customer information is encrypted, and on an isolated DB cluster that no one has regular access to. I also operate with a read-only grant, because manual writes to a prod DB are generally a terrible idea.
I suppose that LLMs will be treated as a code artifact, and liability will shift upstream toward whoever deployed/approved the access in the first place. Code is essentially deterministic, which makes that association fairly simple, but even so I think it's going to boil down to this same paradigm.
Perhaps governance rules will evolve to even explicitly forbid it, but my gut feeling is that for what the future determines to be "practical" reasons (right or wrong) LLMs will warrant an entirely new set of rules to allow them to be in the chain at all.
EDIT: both my wife and I have experience in this area, and the current answer is that companies like KPMG don't have an answer yet. Existing rules do help (e.g. there had better be good documented reasons why it was used, and access must be appropriately scoped, etc.), but there is enough ambiguity around these tools that they say "stay tuned, and take caution".
It's not news that if you just give all developers at a company write access to the production databases, owner permissions on all resources, etc. that velocity can be increased. But at what cost?
The reason we don't do that in most cases is that "move fast and break things" only makes sense for trivial, non-critical applications that don't have any real importance, like Facebook.
Sure don't give an LLM agent write access to the modeled CRM that took months/years to build.
But turning a spreadsheet into an app in a few days? By giving the LLM proper read/write capabilities for velocity? I think the case is there for it. Right tool for the right job.
This makes no sense to me. For anything that has sensitive payment or personally identifiable data, direct access to the DB is potentially illegal.
> The failure/rollback model is much easier with these as long as we have good backup hygiene.
Have you actually operated systems like this in production? Even reverting to a DB state that is only seconds old can still lose hundreds or thousands of transactions. Which means loads of unhappy customers. More realistically, recovery points are often minutes or hours behind once you factor in detection, validation and operational overhead.
DB revert is for exceptional disaster recovery scenarios, not something you want in normal day-to-day operations. If you are saying that you want to give LLM full access to prod DB and then revert every time it makes a mistake, you aren't running a serious business.
I use an LLM to access my database occasionally, but never in production and never with write access. It is genuinely useful. It would never be useful in a production setting, though.
It's worth noting too that people should be wary of what a read only user means in database land. There are plenty of foot guns where writes can occur with read-like statements, and depending on the schema, maybe this would be a rollback-worthy situation. You really need to understand your database and schema before allowing an LLM anywhere near it, and you should be reviewing every query.
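A few Postgres examples of "read-like" statements that aren't harmless, depending on how the role and schema are set up (`log_and_fetch` is a hypothetical function):

```sql
-- Each of these looks like a read, but isn't:
SELECT nextval('order_id_seq');                -- permanently advances the sequence
SELECT * FROM orders WHERE id = 1 FOR UPDATE;  -- takes a row lock others wait on
SELECT log_and_fetch(42);                      -- a volatile function can write anything
```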
Go to any SMB out there and there's a goldmine of processes that could be improved by LLM agents with full RW access to a database, where backups are sufficient as a recovery mechanism that is better than what came before.
If velocity means letting agents live edit a db, I'm fine being slow. Holy hell. Let these people crash and burn but definitely let me know the app name so I know never to use it first.
An LLM agent (with RW access to a DB), a developer, and a few days: these become proper apps that SMBs would pay well for.
Sure don't give an LLM agent access to PII or properly built CRMs etc. But to not see the rest of the landscape seems like a missed opportunity.
I've done work for a hedge fund where the DB ran directly on the manager's desktop. I worked with my local copy and sent an update script, and he had a second copy to run it on to verify.
Even with humans you shouldn't be working directly against the prod DB in these cases!
For dev/prod staging though, there's that other story on HN right now of an LLM agent that maneuvered its way to prod credentials and destroyed prod. And the backups went along with it. I'm paranoid enough to think backups in this use case means out-of-band, uncorrelated storage.
Surely in the Venn diagram of things, there's a slot where it's okay to let a Claude Opus agent run on a process with good backups/recovery? Where taking the risk of a 1-hour restore job is worth the LLM agent velocity?
For extra paranoia, surely even Opus/Mythos can't figure out how to destroy log-level backups shipped to immutable storage.
One can easily imagine an LLM-enabled database that lets a wider audience build meat-and-potatoes line-of-business apps for small team use with minimal compliance concerns.
I saw an article on HN one time about CSVs and how much business still flows through them. Reminds me of the xkcd comic about the one tiny block propping up lots of infrastructure. It stuck with me because it's a ripe area for LLM-agent-based upgrades.
Sure don't give LLMs access to the well architected blocks. But not wanting to improve the brittle areas seems crazy to me even if it's contrarian.
If you're just vibe coding a tool for yourself, you don't have a 'production database' at all, even if you use database technology for storage. Just like many Android apps use local sqlite DBs, but they're not production databases.
Of course in this case no traditional wisdom about production databases matters to you. In other words, it's off-topic.
> Sure don't give an LLM agent write access to the modeled CRM that took months/years to build.
> But turning a spreadsheet into an app in a few days? By giving the LLM proper read/write capabilities for velocity? I think the case is there for it. Right tool for the right job.
2) In regards to having good backup hygiene, who is we?
Examples for me are all the apps that live in a spreadsheet, or in a MS Access database. Or all the crappy ad backed apps on the iOS app store. People wipe full spreadsheets all the time and backups are the only recovery.
Just last weekend I was frustrated with the poor quality of Pokedex type apps that spam ads left and right. Took just one session with Claude Opus to roll a custom Pokedex. It knew internally about things like the PokeApi dataset, Pokemon data modelling etc. To-the-hour snapshots of the database are trivial for bespoke apps like this so the LLM agent velocity seems like an okay trade off for me.
Clearly people don't agree...
If you want to give your agents a DB for their own work as a scratchpad or something that’s great. They can not only go to town, but also analyze their own work and iterate on it.
If you are talking about a production database, agents should not be hitting it directly under any circumstances. There needs to be an API layer with defined usage patterns, rate limits, etc.
This is basically the same as saying “databases weren’t designed for interns to run live inline migrations in prod”. Yeah of course they aren’t.
And the same as saying "databases weren't designed for non-technical people to connect with report-building tools like Power BI and Excel and run reports in the middle of peak customer checkouts."
As a DBA, I'm constantly surprised by what people think will be completely harmless to hook up to the database server - and then how much havoc it causes. Gonna be a rough decade.
I work with a team that does stuff like this, returning a 200 and a body containing "error: I didn't do what you said because _insert error here_"
The problem is that you returned OK instead of ERROR when things were not OK and there was an ERROR.
It's a design that smells of teams trying to hit some kind of internal metric by slightly deceptive means.
I've seen this approach before; it mostly follows from using the status code to distinguish application errors (200 + ok/error in the body) from other kinds of errors that might arise.
Where do these "application errors" occur if neither on a server nor a client?
I think the reality is that management sees "5xx means server error, so our team's KPI is now server error rate, the lower the better!" Then the team just stops using 500 errors as much as possible. They probably justify it with things like "well, such and such problem isn't our fault, so it's not really a server error." This kind of thinking perverts the intent of 5xx codes. They are supposed to indicate any failure to handle the request that happens on the server, NOT measure whether the dev team is making a good application.
I don't know about others; I know about this one because I had to dig into a bug where something on live looked like it succeeded but didn't, while the error showed up fine on dev. I ended up downloading the Apache source and finding where it was happening, before just using a 200 response.
Executives who wouldn’t take the time to build a report are happy to ask an AI agent to do so.
Another way to bring prod down even with read-only access, depending on your atomicity settings: start a transaction and don't commit or abort it, just leave it dangling. That's a cute one.
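For example, in Postgres terms:

```sql
BEGIN;
SELECT * FROM orders WHERE id = 1 FOR UPDATE;  -- row lock held from here on
-- ...and then never COMMIT or ROLLBACK. The session sits idle-in-transaction,
-- blocking writers on that row and pinning the xmin horizon so VACUUM can't
-- clean up, until something like idle_in_transaction_session_timeout kills it.
```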
I jokingly tell students, "We all know executives are gonna make bad decisions no matter what the data says. Might as well give them the random numbers more quickly."
I recently encountered `is_as BOOL` in an important table. After way too much time invested, we found out it meant "is active service". </DDL rant>
One of the worst has field names like `ft_0001...N` and table names like `UNCC_00001...N`, everything in `text` fields (even numbers!), zero FKs, almost no indexes, and what are views?
The other has this funny field that is a blob needing decoding with a specific FreePascal version. The field? The price of the product.
Another has, in the same column, a mix of "," and "." as the decimal separator, and I need to check the decimal places to deduce which is which.
FUN.
P.S.: I normalize all these ERPs into my own schema and have gotten praise for things like my product table being called products.
Wrote a Windows .Net program once upon a time to convert the data from other financial CRM systems into the system I worked on. Built a data mapping tool as no customer we onboarded placed "custom" data in the same tables or fields even when using the same financial system.
I actually miss doing that kind of work, my brain seems to be wired to find it fun. Writing SQL is one thing I don't delegate to an AI or even an ORM like Doctrine.
Why am I not allowed to do things directly? For the same reasons the tables are like that.
And so you got tables like LANDMRK and columns like RCR_RCRDR.
Sounds like a table designed by Forrest Gump.
Operational databases store transactions and support day-to-day application workflows.
For analysis, data is often copied into separate analytical databases (data warehouses), which are structured for efficient querying and large-scale data processing. These systems are designed to handle complex, random queries and heavy workloads.
LLM agents are the best way to analyze data stored in these databases. This is the future.
Why, and how?
Based on my experience with Claude, it's pretty damn good at doing data analysis, if given the right curated data models. You still need to eyeball the generated SQL to make sure it makes sense.
> and how?
1. Replicate your Postgres into Snowflake/Databricks/ClickHouse/etc, or directly to Iceberg and hook it up to Snowflake/Databricks/ClickHouse/etc.
2. Give your agent read access to query it.
3. Build dimensional models (facts and dimensions tables) from the raw data. You can ask an LLM for help here; Claude is pretty good at designing data models in my experience (see the sketch after this list).
4. Start asking your agent questions about your data.
Keep steps 3-4 as a tight feedback loop. Every time your agent hallucinates or struggles to answer your questions, improve the model.
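As a rough illustration of step 3, in generic Postgres-flavored SQL (all table and column names are hypothetical):

```sql
-- One fact table plus a dimension, built from the replicated raw tables.
CREATE TABLE dim_customer (
    customer_key bigint PRIMARY KEY,
    name         text,
    segment      text
);

CREATE TABLE fact_orders (
    order_key    bigint PRIMARY KEY,
    customer_key bigint REFERENCES dim_customer,
    order_date   date,
    amount       numeric(12,2)
);

-- The agent's questions then become straightforward aggregates:
SELECT c.segment, date_trunc('month', f.order_date) AS month, sum(f.amount)
  FROM fact_orders f
  JOIN dim_customer c USING (customer_key)
 GROUP BY 1, 2;
```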
Side note: I'm currently building a platform that does all 3 (though you still need to do 2 yourself), you just need Postgres + 1 command to set it up: https://polynya.dev/
Yesterday, Claude decided to go with nvarchar(100) for an IP address column instead of varbinary(16), and thinks RBAR triggers are just as good as temporal tables.
So, no. Claude is not good at designing data models in my experience.
inet - allows zero bits to the right of the netmask
cidr - does not allow

- Read access through OLAP, not OLTP. You just need to set up near real-time replication between your OLTP and OLAP.
- Write access through an API, just like your application. You can add fancy things like an approval layer, e.g. your agent cannot "ban_user(id)", but it can "request_to_ban_user(id)", and the action only happens once you approve it.
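A minimal sketch of that approval layer, with hypothetical names throughout: the agent records intent, and nothing changes until a human applies it.

```sql
-- The agent can only file a request; nothing changes until a human approves.
CREATE TABLE ban_requests (
    id          bigserial PRIMARY KEY,
    user_id     bigint NOT NULL,
    reason      text,
    approved_by text  -- stays NULL until a human signs off
);

CREATE FUNCTION request_to_ban_user(p_user_id bigint, p_reason text)
RETURNS bigint LANGUAGE sql AS $$
    INSERT INTO ban_requests (user_id, reason)
    VALUES (p_user_id, p_reason)
    RETURNING id;
$$;
```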
Before redesigning your database, consider seeing a psychiatrist.
> An agent working on a customer analytics task might issue a join across five tables that has never been issued before, hold the connection while it thinks about the result, then issue a completely different follow-up.
Aren't agents simply automated human personas? Like, why would an agent make a join across five tables, given that its reasoning is grounded in human-aligned decisions? For example:
If the agent knows the schema and is aware of the indexes defined, shouldn't it only work its way through those 'known paths'? Why would you allow agents to deviate from known paths?
This is one of those takes that is so close to understanding the problem, and then drawing an insane conclusion.
The problem is that AI agents and the code they output are untrustworthy, buggy, insecure, and lacking in any of the standards the industry has developed over the last 30 years. The solution to this is "don't use AI agents", not "change the rest of the stack to accommodate garbage".
Also, the DB will most certainly not silently ignore a unique constraint violation: it will send an error back. EDIT: unless you’re using INSERT OR IGNORE, of course.
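For example, against a hypothetical users table:

```sql
INSERT INTO users (id, email) VALUES (1, 'a@example.com');
INSERT INTO users (id, email) VALUES (1, 'b@example.com');
-- ERROR: duplicate key value violates unique constraint "users_pkey"

-- Silence has to be asked for explicitly:
INSERT INTO users (id, email) VALUES (1, 'b@example.com')
ON CONFLICT (id) DO NOTHING;  -- Postgres; SQLite spells it INSERT OR IGNORE
```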
How do you even enforce it?
And why are you even giving an agent access to a live DB in the first place?
DENY DELETE TO [agent] ON DATABASE::current;
(yes, the recommended way would be to simply grant only SELECT, INSERT, UPDATE, but if I were the DBA here I would definitely put in place an explicit deny)

This doesn't make sense in the context of the author's chosen example (Postgres). Postgres connections are very heavy and there is a huge performance penalty for cycling them quickly, and a whole range of silly workarounds for this fact (pgbouncer). Maybe the author meant to say that sessions are brief.
Probably not; maybe only for analytical (OLAP) purposes in read-only mode.
For transactional OLTP loads, it is better to use Kafka-style durable queues: have agents create a change record to mutate the state, but never the projection itself, which can be recomputed at an arbitrary point in time via a time-travel mechanism, branched out into different versions, etc.
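Sketched in plain SQL terms (names are hypothetical; a real setup might use Kafka topics rather than a table):

```sql
-- Agents append change records; the projection is derived, never written.
CREATE TABLE account_events (
    event_id   bigserial     PRIMARY KEY,
    account_id bigint        NOT NULL,
    delta      numeric(12,2) NOT NULL,
    created_at timestamptz   NOT NULL DEFAULT now()
);

-- The only write an agent is allowed:
INSERT INTO account_events (account_id, delta) VALUES (42, -19.99);

-- The balance projection can be recomputed as of any point in time:
SELECT account_id, sum(delta) AS balance
  FROM account_events
 WHERE created_at <= timestamptz '2024-01-01'  -- "time travel" by replaying to a cutoff
 GROUP BY account_id;
```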
This assumption is that of a non-DBA who happened to get a hold of a database.
When you have sufficient users, your expected set of queries is a complete shit show. Some will be efficient, many will be poorly tested and psychotic, and indistinguishable from a non-deterministic LLM.
Also who said humans can’t query the database directly in prod? If not direct sql access, business users have the next best thing with custom reports and such. And they will very much ask for ridiculous amounts of computation to answer trivial questions.
It was a foundational assumption of SQL that business users would directly access the database and write their own queries.
It's why row-level access and permissions exist. Use them.
I’ve always baked important invariants directly into the database with constraints and triggers. Maybe this is because I work on internal apps, where the data is more important than the presentation. Maybe it’s from my functional programming experience and some need to make invalid states unrepresentable.
Regardless, I believe that the data layer should be the most carefully designed part of an app.
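For example, a couple of invariants baked directly into a (hypothetical) table, so invalid states are unrepresentable no matter who performs the write, human, app, or agent:

```sql
CREATE TABLE invoices (
    id      bigserial PRIMARY KEY,
    amount  numeric(12,2) NOT NULL CHECK (amount > 0),
    status  text NOT NULL CHECK (status IN ('draft', 'sent', 'paid')),
    paid_at timestamptz,
    CHECK (status <> 'paid' OR paid_at IS NOT NULL)  -- "paid" implies a timestamp
);
```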
I'm doing some research on this topic at the moment, along with many other people on the database team at Oracle. The article is written from a Postgres perspective. Most of these problems don't surface with a better database engine, and you don't need to do any work to solve them. For people also interested in this topic, here are some notes on how things line up:
• Statement timeouts: an indirect hack. What you actually want are resource usage limits, like CPU wall-clock time excluding lock waits. That's "CREATE PROFILE" in Oracle, or if you want more power (e.g. RAM/IOP limits too) there's https://docs.oracle.com/en/database/oracle/oracle-database/2... (see the sketch after this list).
• Writes: this boils down to "the user might make a mistake", and yes, sure, no database can magically stop all mistakes. The usual fix here is to define stored procedures that provide a set of safe write operations. Same in all DBs.
• Soft deletes: You can turn on Flashback (time travelling SQL) and then just use regular DELETEs. You can also undo transactions in some cases in Oracle even if other transactions happened afterwards as the DB can generate SQL for you that attempts to undo the effects.
• Idempotency keys: Built in if you turn on Transaction Guard. https://docs.oracle.com/en/database/oracle/oracle-database/2...
• Assumption that connections are brief: Oracle has built in server-side DB pooling and horizontal scalability, so separate 'bouncers' aren't needed. You can just let agents connect directly without needing special infrastructure.
• Observability: you can associate metadata with connections that are then associated with recorded queries so agent/step can be looked up given a query. You can then find these queries if they're holding locks.
• Schema is a contract with eng: Well named schemas with comments are a good practice in any DB. Oracle views are automatically writeable in many cases (i.e. the SELECT query is inverted), so if your schemas are messy you can use views to clean them up and those views are still usable to make changes to the data if you need to.
• Scoping blast radius: Lots of security features in the DB to do this, as well as things not typically considered security like transaction priority levels to support lock breaking (run agents at low priority and app server writes will kill agent transactions). And you can easily make DB CoW snapshots.
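A hedged Oracle sketch of the first and third bullets (names and limit values are illustrative; FLASHBACK TABLE also requires row movement to be enabled on the table):

```sql
-- Resource limits via a profile:
CREATE PROFILE agent_profile LIMIT
    CPU_PER_CALL           3000     -- hundredths of a second per statement
    LOGICAL_READS_PER_CALL 100000   -- cap on block reads per statement
    IDLE_TIME              10;      -- minutes before an idle session is killed

ALTER USER agent_user PROFILE agent_profile;

-- And the Flashback angle: undo a bad DELETE after the fact.
FLASHBACK TABLE orders TO TIMESTAMP SYSTIMESTAMP - INTERVAL '10' MINUTE;
```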
So once your DB handles all the basics like that, the interesting things remaining are really all the semantics that are encoded into the application layer that the database doesn't see.
Pointing agents (or humans) at a live database to write arbitrary SQL might sound scary, but it's the exact use case databases have been designed for from the beginning, and at least some of them have lots of features designed to make this as safe as possible. It may be that even more features are needed - that's the sort of question I'm currently exploring. But the foundation is there, at least.
It's a good idea to be defensive and design the system in a way that it can "fix" itself.
But for the love of god, don't let an LLM do everything it wants.