One question: how does the tool handle observability for long-running migrations (e.g. adding an index to a 10TB table)? In a DAG, if downstream nodes are waiting, do you provide status tracking for the active node, or does it assume the backend handles the lock wait state?
For example, if you’re adding an index on a 10TB table, you’d just see:
Applying abc123: add index...
…and it will hang there until the database completes the operation.
That said, this is mostly a database-level concern, not something a migration tool can fully abstract away.
If you’re working with large tables, you’d typically want to use:
CREATE INDEX CONCURRENTLY (Postgres)
regardless of which migration tool you’re using.
Migratex generates the SQL, but you can edit it before applying. So in practice, you’d replace:
CREATE INDEX ...
with:
CREATE INDEX CONCURRENTLY ...
in your up.sql.
That said, there are a few things we could improve here:
Show elapsed time per migration
Add a --dry-run flag to preview what would run
Add a --statement-timeout option
These are now on the roadmap.
Two developers are working on branches and they both create a migration file named 004.
One of them renames their file to 005 then you merge the changes deploy them and hope that everything works out fine.
Sometimes it does work.. Sometimes a column just disappears because the migration files were run in the wrong order.
We had to deal with this problem many times, including a few times when we had to stay up late at night to fix things that we decided to think of a new way to handle migrations.
Most Object Relational Mapping tools use an numbered system for migrations. This system works fine when one developer is working on a project but it does not work well when there are multiple branches. You end up having to coordinate the migrations dealing with conflicts or worse you have to deal with issues that only show up when the project is live.
Rolling back changes is another problem. Writing rollback scripts is hard and people often skip this step or get it wrong especially when multiple changes are involved.
That is why we built Migratex.
Of using a numbered list migrations in Migratex form a directed acyclic graph, which is similar to how Git models commits.
Each migration file declares what it depends on so multiple branches can create migration files without any conflicts.
When you merge changes:
* If two migration files truly conflict, for example if they try to change the column to different types Migratex flags the problem.
* If the migration files are independent for example if they change tables or columns they merge cleanly.
You do not have to rename any files. You do not have to coordinate with anyone. You do not have to worry about deployments because of ordering issues.
Migratex also generates SQL code automatically:
* You update your Object Relational Mapping schema.
* You run the migratex generate command.
* It compares the schema to the database and generates the migration SQL code.
Migratex also generates rollback scripts alongside the migration scripts to make rollbacks safer.
The goal of Migratex is simple:
Make schema changes safe in an environment with developers, without relying on manual ordering or fragile rollbacks.