Instead, I am a fan of doing both: committing the migration together with the resulting schema, whether that schema is generated or hand-written alongside the migration. Then have tests to ensure the database schema matches the expected schema after the migration.
Generating these artifacts is fine, but in TFA's case there is no chance I wouldn't inspect and possibly modify the generated "diff" migration. It's significantly easier to go the other way: write the migration and show me the resulting schema diff.
When writing a migration, the resulting schema is usually much, much less important than the characteristics of the migration itself. When I review a migration, my first question isn’t “is this the right schema” but “is this migration going to cause downtime”. I’d much rather have a smooth migration to an incorrect schema than a critical table locked for minutes or hours.
I think that updates of stateful components should be imperative (explicit migrations), not declarative (implicit migrations). For example, I don’t think Terraform is a great tool to manage RDS: it doesn’t tell you the consequences of changing an attribute (a database restart or other downtime-inducing behavior). I’d much rather have to explicitly say how to get from state A to state B.
Similarly, I don’t think SQL migrations are perfect: they’re still declarative, and you still need implicit knowledge to know whether a migration will take a lock and what the consequences will be. I’d much rather have to code “take explicit lock; alter table xxx;”.
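For concreteness, a rough sketch of what that could look like in Postgres (keeping the commenter's "xxx" table; the column is made up):

  -- fail fast instead of queueing behind long-running transactions
  SET lock_timeout = '5s';
  BEGIN;
  -- the lock is now explicit and visible to the reviewer
  LOCK TABLE xxx IN ACCESS EXCLUSIVE MODE;
  ALTER TABLE xxx ADD COLUMN note text;
  COMMIT;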
This tool probably allows editing migrations, but I don’t think it’s a step in the right direction. Maybe it’s a step towards databases being much better at migrations (so that we can rely on never having downtime), but even then I think it’ll get worse before it gets better.
Modern diff tools are designed to provide better guardrails in these situations. For example, pg-schema-diff [0] tries to generate zero-downtime migrations by using lock-free operations and warns you about potentially hazardous ones.
I think it's a good direction to bake these best practices into the tooling itself, rather than relying purely on the experience of engineers.
Once a tool has those abilities, adding linters and destructive-action guardrails is much easier. Especially when compared to a traditional migration tool, which often has no underlying understanding of the ALTER TABLE operations being requested.
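For example (Postgres, table/index names made up), the classic lint is flagging an index build that blocks writes and suggesting the concurrent form instead:

  -- hazardous: holds a lock that blocks writes for the whole build
  CREATE INDEX orders_user_id_idx ON orders (user_id);

  -- what a linter would suggest (note: cannot run inside a transaction)
  CREATE INDEX CONCURRENTLY orders_user_id_idx ON orders (user_id);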
When implemented well, with appropriate guardrails and linters, it's perfectly safe. And it has many benefits over imperative schema migrations, such as a substantially better Git versioning flow, and ability to synchronize environments / solve schema drift natively.
The only major conceptual downsides are the inability to handle row data migrations and the inability to handle renames. These can be major hurdles for some smaller companies, but are largely irrelevant at companies with large DB infrastructure, which require special-case handling for those operations anyway. In other words, if you have large tables, row data migrations are already substantially more complex than running a single UPDATE statement, and you can't do them using a traditional imperative migration tool anyway.
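To illustrate that point: on a big table the backfill usually ends up as a batched loop driven from outside the migration tool, not a single statement. A rough sketch (table/column names made up):

  -- run repeatedly until it reports 0 rows updated,
  -- instead of one long UPDATE that locks every row at once
  UPDATE big_table
  SET new_col = old_col
  WHERE id IN (
    SELECT id FROM big_table
    WHERE new_col IS NULL
    LIMIT 10000
  );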
> there is no chance I wouldn't inspect and possibly modify the generated "diff" migration
Of course, you're generally supposed to do that with these tools. Database changes should always be reviewed carefully. Same with non-database-related infrastructure-as-code tools.
> Of course, you're generally supposed to do that with these tools
This seems to be the crux of the comments - we'll try to make it much clearer that declarative schemas don't skip any migration/review process, they just provide another way of generating the migrations (note: you can also generate them directly from the database using the CLI)
Is it open source?
I mean, I see you say
> We then use a schema diff tool, like migra, to figure out the necessary updates to views and functions when generating the migration file.
But "like migra" is very nonspecific. I guess you are not using migra itself
I also object to the use of “declarative” here. Either we are talking about the schema as present in the database, then it’s neither declarative nor non-declarative, it’s just whatever is in the database. Or we are talking about a schema definition, and then I really don’t know what a non-declarative schema definition would look like, in contrast to what is called “declarative” here. Thirdly, the traditional “declarative” SQL schema definition is really a series of imperative SQL statements, so arguably not declarative.
What they seem to mean is a minimal sequence of statements that results in the desired schema, as opposed to a longer history of schema-altering statements. However, the minimal version is technically still a series of schema-altering statements, starting from (presumably) an empty schema.
In other words, the engineer is declaring "make my schema look like this" and the tool makes that happen. That's the key definition of a declarative system, consistent with how this is described in CS classes.
Meanwhile, traditional migration tools are "imperative", as the engineer must tell the tool exactly which operations (e.g. ALTER) to run.
  ALTER TABLE foo ADD COLUMN bar text;  -- column type assumed for illustration
  UPDATE foo SET bar = baz;
  ALTER TABLE foo ALTER COLUMN bar SET NOT NULL;
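For contrast, the declarative input for that same end state is just the desired table definition (a sketch; column types assumed), and the data backfill in the middle has nowhere to live:

  CREATE TABLE foo (
    id  bigint PRIMARY KEY,
    baz text,
    bar text NOT NULL
  );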
Inferring migrations isn't crazy. Automagically applying those migrations without review is crazy.
In my experience, declarative APIs are very powerful abstractions for specific cases where finding the path to the declared state is better left to a machine. This is seldom the case - in most cases, offering the programmer control over the changes leads to better behaviors.
Kubernetes and IaC tools lead the way to a declarative state of infrastructure and these add a ton of value. But they were also incredibly hard to build - it took many years before Kubernetes eventing and control loop abstractions were rock solid. Most CRD-backed implementations suffer from tons and tons of bugs, and most CRDs are not declarative - they abstract away an imperative operation! I guess this is nothing new - "anything in excess is bad".
Anyways, I think an imperative approach offers much higher control and predictability at a lower cost. The world inherently is imperative.
I've been waiting for this top comment for longer than you can imagine. The declarative madness has always bothered me. Sometimes it's easier to maintain when you see the process. And harder to declare the final state. It might look neat, but maintainability beats neatness every day.
Regarding prior art: Django migrations are indeed declarative, and were very early in this space. But they're tied to Python model definitions in the ORM, which is a bit more of a special-case than the native SQL CREATE based approach described here.
As for Prisma Migrate, they directly copied several innovations from my tool Skeema [1] which has been available since 2016, so they can be equally accused of "reinventing" things :)
Not that I invented pure-SQL declarative schema management either, by any stretch. I was largely inspired by the workflow at Facebook, who adopted declarative table management company-wide back in ~2012. FB found that having a declarative reconciliation loop is an operational necessity with a massively sharded system, given some hardware just dies every day. And centralized SQL-based schema management is advantageous when applications are written in many different programming languages.
There are some existing declarative tools for Postgres and SQLite though, see https://news.ycombinator.com/item?id=43576450
In the old situation, you write a CREATE TABLE statement at the start of the project. And when you add a feature, you have to write an ALTER TABLE script.
In this new situation, you just change the CREATE TABLE script. And Supabase uses migra to figure out the difference and it automatically alters the table.
What's interesting is that in your SQL code, there's no longer any difference between creating a new database, and updating an existing database.
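Roughly, with made-up table/column names, the difference looks like this:

  -- old way: you hand-write the migration
  ALTER TABLE profiles ADD COLUMN avatar_url text;

  -- new way: you only edit the declarative file...
  CREATE TABLE profiles (
    id         bigint PRIMARY KEY,
    username   text NOT NULL,
    avatar_url text               -- the only edit
  );
  -- ...and the diff tool generates the ALTER TABLE above for you.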
I helped set this up in a fortune 500 company a few years ago. They were using a team of db engineers to execute manually written change scripts, with manual reviews, control processes, and deployment schedules. You'd be lucky if you got your schema change to prod in a month. When they started integrating this tool on some trial candidates they found SO many inconsistencies between environments: server settings differences, extraneous or missing indexes, vestigial "temp" tables created during previous migrations, enum tables that should be static with extra or missing rows, etc, etc. All the environment differences meant that deployments had to be manual in the past. Once they got through the initial pain of syncing up the environments the whole department got way more efficient.
This implies somebody with admin rights makes alterations in an ad-hoc way without first doing it in a test env.
If they continue with ad-hoc stuff, then it means auto-generated migrations will be different in test vs prod. (I prefer to test exactly the same thing that will be used in prod.)
Not necessarily. With a large team/org using the same database schema, it can just mean multiple people were trying to make changes to an environment around the same time, e.g. the migrations were applied in a different order in staging vs prod.
Some migration tools provide extra checks for strict ordering, but many do not. There's often no guarantee that the migration file naming scheme ordering, Git commit ordering, and actual DB apply ordering line up -- that's 3 different possible sources of truth, or more since the DB state varies by environment (dev/stage/prod etc).
Late-night hot-fixes (to solve an emergency outage) can be another source of inconsistencies / drift.
> If they continue with ad-hoc stuff, then it means auto-generated migrations will be different in test vs prod
That depends on the declarative tool and whether it fully syncs the schema each time, or just generates migrations which are frozen into a plan which is executed as-is in all environments. Not that full-sync is bad, but yes in that case it will generate different things in each env. Although the end result is that it will solve the drift, and give you the same end state in all environments. And that's likely what you want to happen: after running the tool, the database state will match the desired state which was expressed by the CREATE statements in your schema repo.
That said, the declarative tooling should have sufficient safety checks to ensure it doesn't do anything destructive in prod without very loudly telling you and requiring manual confirmation. That way, you won't be harmed when trying to synchronize an environment that had unexpected out-of-band changes.
That was exactly our experience too.
Perhaps we didn't highlight enough in the blog post that schema diff was not meant to replace manual review. It simply provided a good starting point for us to iterate on the migration, which often boosts efficiency.
This is how I've set it up at my current employer. It works well. We modeled it after the Terraform plan/apply steps: we double-check that the script generated by the "apply" step matches the script generated by the "plan" step (since these can occur at significant temporal distances) and fail the deployment if they don't match, so we can be sure that what we've read and approved matches what gets executed.
yeah - this is definitely the intended flow here. We won't be recommending anyone blindly applying generated migrations.
As you mention, it is expected that you generate & review on your local development machine, check into source control, push & merge. We've also been using this internally for ~2 years now and it works great
1/ Add a new column to the declarative file
2/ Generate a new migration: `supabase db diff -f my_new_migration`
3/ Review/edit the generated migration, check it into git
4/ Apply migration to database
Say I have a users table with a name column. Then I alter the table and split the name column into two new columns: first name and last name.
How is it possible to infer this change, just from seeing the new and old columns?
1. Migration that adds firstname and lastname columns with all nulls
2. Deploy application code change to start populating firstname and lastname alongside fullname, still reading fullname in the code.
3. backfill the firstname and lastname values with a script/command/migration
4. change app code to read firstname and lastname and stop writing fullname
5. drop the fullname column
I don't think there's a safe way to do all that in a single migration unless all your app code also lives in the database so it can be atomically deployed. If you have multiple app servers and do rolling deploys with no downtime, I think it has to be done in these 5 steps.
6. ensure there are no nulls in firstname and lastname
7. alter the columns to be NOT NULL
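In SQL, the schema-touching steps of that sequence could look roughly like this (Postgres syntax; column names taken from the steps above, types assumed):

  -- 1. add the new columns, nullable for now
  ALTER TABLE users ADD COLUMN firstname text, ADD COLUMN lastname text;

  -- 3. backfill (a naive split, for illustration only; batch it on big tables)
  UPDATE users
  SET firstname = split_part(fullname, ' ', 1),
      lastname  = split_part(fullname, ' ', 2)
  WHERE firstname IS NULL;

  -- 5. drop the old column once nothing reads or writes it
  ALTER TABLE users DROP COLUMN fullname;

  -- 7. only after verifying no NULLs remain (step 6)
  ALTER TABLE users ALTER COLUMN firstname SET NOT NULL,
                    ALTER COLUMN lastname SET NOT NULL;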
Because no non-statistician uses nullable columns, right? Of course, some DBs (SQL Server?) infer NULL from the empty string, or am I misremembering?
Always having the columns be NOT NULL is a fundamental cheat, after always having a PK, or is that too old school for 2025?
Yes, it just requires extra care when querying and handling the rows.
It's always just easier, if you can, to make it NOT NULL after prepopulating all rows' columns to the empty string (or real data).
Sometimes NULL is truly different than the empty string, but that, like you said, is just a kind of data.
Side note, but why do folks always bring up this "firstname lastname" example? It is not ever possible to implement that correctly in an automated fashion: some people have spaces in their first names, and some people have spaces in their last names. (I don't mean to single you out, as this exact example comes up multiple times in every declarative vs imperative discussion!)
If I can, I even avoid storing actual names of users at all. If I have to, I use a simple free form "display_name" column; what is put there is up to the user.
I'm curious what makes it better than the approach in the blog?
If it's this:
> also maintains a schema.sql file that can be loaded as a one-off
That's exactly what the declarative files provide
SSDT can also sync db projects (nicely organized DDL .sql files representing the schema) and databases (one way or the other). With the IDE support you can do stuff like "find all references" on a column or any other DB object, and build the project to check for errors. Linting the schema becomes possible, etc. I have a hard time when I have to go back to imperative schema management...
For sake of completeness for users of other databases: for SQLite check out stb-tester's migrator [5], and last but not least for MySQL/MariaDB there's my tool Skeema [6].
[1] https://github.com/stripe/pg-schema-diff
[2] https://github.com/bikeshedder/tusker
[3] https://github.com/djrobstep/migra
[4] https://github.com/sqldef/sqldef/
[5] https://david.rothlis.net/declarative-schema-migration-for-s...
[6] https://github.com/skeema/skeema
I wasn't aware of sqldef; I will need to take a look. Something nice about SSDT is that when we export a DB to a SQL project (files), each DB object is neatly organized in its own file, and the tooling solves ordering and dependency cycles on its own. I'm not sure any open-source tool currently does it like that (Skeema maybe? but no PG).
Skeema does indeed separate each DB object to its own file, and each logical schema to its own subdirectory. It also figures out how to correctly apply changes in a valid order, but yeah it's MySQL/MariaDB only. I think some of the declarative FOSS Postgres tools do attempt to provide a correct ordering, but from memory I'm not certain.
The thing you need to be doing is testing your migrations, and some dumbass on your team is going to generate the migration during CI and load it into your database as a merge step, and you won't realise what a mistake this was until possibly years later.
The good news is, you might be able to pay someone an enormous amount of money to unfuck things. Not good for you, I mean, obviously. Good for whoever you just bought a car.
1/ Add a new column to the declarative file
2/ Generate a new migration: `supabase db diff -f my_new_migration`
3/ Review/edit the generated migration
Docs are here:
https://supabase.com/docs/guides/local-development/declarati...
In the past we only offered tools to create a SQL migration to make a change to your database. Now you can write the state of your database into files, and the migration is generated for you.
This is very similar to something like Rails/Phoenix/whatever, where you write your models and then it generates a change. The difference is that here you write your models in raw SQL, rather than in an ORM.
“Simple declarative schema migration for SQLite”
https://david.rothlis.net/declarative-schema-migration-for-s...
Discussed previously:
https://news.ycombinator.com/item?id=31249823
Disclosure: I am the co-author of that article