It’s not just nice for monorepos. It makes both reviewing and working on long-running feature projects so much nicer. It encourages smaller PRs or diffs so that reviews are quick and easy to do in between builds (whereas long pull requests take a big chunk of time).
It wasn't the Mercurial team saying it was faster than Git; that was Facebook after contributing a bunch of patches after testing Mercurial on their very large mono-repo in 2014 [1]:
> For our repository, enabling Watchman integration has made Mercurial’s status command more than 5x faster than Git’s status command. Other commands that look for changed files, like diff, update, and commit, also became faster.
In fact they liked Mercurial so much they essentially cloned it to create their own dvcs, Sapling [2]. (An aside: Facebook did all of this because it was taking too long getting new engineers up to speed with Git. Shocker.)
Today, most of the core of Mercurial has been rewritten in Rust; when Facebook did their testing, Mercurial was nearly 100% Python. That's where the "Mercurial is slow" thing came from; launching a large Python 2.x app took a while back in the day.
I was messing with an old Mercurial repo recently… it was like a breath of fresh air. If I can push to GitHub using Mercurial… sign me up.
[1]: https://engineering.fb.com/2014/01/07/core-infra/scaling-mer...
I was never a fan of hg either, but now I can use jj, and get some of those benefits without actually using it directly.
Fun story: I don't really know what Microsoft's server-side infra looked like when they migrated the OS repo to git (which, contrary to the name, contains more than just stuff related to the Windows OS), but after a few years they started to hit some object scaling limitations where the easiest solution was to just freeze the "os" repo and roll everyone over to "os2".
I guess what's old is new again.
I have fuzzy memories of reading about it.
I think the problem had something to do with the number of git objects the repo had scaled to, which caused crazy server load or something. I don't remember the technical details, but it definitely involved the scale of git objects.
I think what happened is Google bought a license for the source code and customised it.
That makes sense because vanilla Perforce is unbearably slow and impossible to scale.
The efforts to sell priest robes to fruit vendors were a little silly, but I'm glad they didn't catch on because if they had caught on they no longer would have been silly.
It worked just fine 99% of the time, and then 1% of the time it became completely unusable.
Git is super mid. It’s a shame that Git and GitHub are so dominant that VCS tooling has stagnated. It could be so so so much better!
IOW, what do you know that nobody else does?
You can visit any resource about git and branches will have a prominent role. Git is very good at branches. Mercurial fans will counter by explaining one of the several different branching options it has available and how it is better than the one git has. They may very well be right. It also doesn't matter, because the fact that there's a discussion about what branching method to use really just means Mercurial doesn't solve branches. For close to 20 years the Mercurial website contained a guide that explained only how to have "branches" by having multiple copies of the repository on your system. It looks like the website has now been updated: it doesn't have any explanation about branches at all that I can find. Instead it links to several different external resources that don't focus on branches either. One of them mentions "topic", introduced in 2015. Maybe that's the answer to Git's branching model. I don't care enough to look into it. By 2015 Git had long since won.
Mercurial is a cool toolbox of stuff. Some of them are almost certainly better than git. It's not a better product.
I think there are (or perhaps were) some product issues regarding the specifics of various workflows. But at least some of that is simply the inertia of entrenched workflows and where there are actual downsides the (IMO substantial) advantages need to be properly weighed against them.
Personally I think it just comes down to the status quo. Git is popular because it's popular, not because it's noticeably superior.
Google and Meta don’t use Git and GitHub. Sapling and Phabricator are much, much better (when supported by a massive internal team).
I personally went from .latest.latest.latest.use.this (naming versions as latest) to TortoiseSVN (which I struggled with), to Git (where I was also one of those "walk around with a few memorised commands" people who don't actually know how to use it), to reading the fine manual (well, 2.5 chapters of it), to being an evangelist.
I've tried Mercurial, and, frankly, it was just as much black magic to me as Git was.
That's network effects.
But my counter is - I've not found Mercurial to be any better, not at all.
I have made multiple attempts to use it, but it's just not doing what I want.
And that's why I'm asking: is it any better, or not?
The thing is, to understand which one is actually better, you would have to give the same amount of investment in the second tool, which is not something most people are willing to do if the first tool is "good enough". That's how Python became the default programming language; people don't miss features they do not understand.
But what I will point out is that, for better or worse, people are now looking at LLMs as Git masters. That effectively makes the LLM the UI, which removes any assumed advantage of whichever tool has the "superior" UX.
I do wish to make absolutely clear that I personally am not yet ready to completely delegate VCS work to LLMs. As I have pointed out, I have what I like to think of as an advanced understanding of the tools, which affords me the luxury of not having an LLM shoot me in the foot; that is solely reserved as my own doing :)
Are we back to "programming language X is slow" assertions? I thought those had died long ago.
Better algorithms win over 'better' programming languages every single time. Git is really simple and efficient. You could reimplement it in Python and I doubt it would see any significant slowness. Heck, git was originally implemented as a handful of low level binaries stitched together with shell scripts.
Python is absurdly slow - every method call is a string dict lookup (slots are way underused), everything is all dicts all the time, the bytecode doesn't specialize at all to observed types, it is a uniquely horrible slow language.
I love it, but python is almost uniquely a slow language.
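A minimal sketch of the dict-backed attribute storage described above, as it behaves in CPython. The class names `Plain` and `Slotted` are made up for illustration, and the snippet only shows where attributes live; it makes no claim about how much time either layout costs.

```python
class Plain:
    """Ordinary class: instance attributes live in a per-instance dict."""

    def __init__(self, x):
        self.x = x

    def double(self):
        return self.x * 2


class Slotted:
    """Same class with __slots__: no per-instance dict is created."""

    __slots__ = ("x",)

    def __init__(self, x):
        self.x = x

    def double(self):
        return self.x * 2


p = Plain(21)
s = Slotted(21)

# Instance attributes are stored under plain string keys.
plain_attrs = p.__dict__

# Methods are found by name in the class's own namespace mapping.
method_in_class_dict = "double" in Plain.__dict__

# With __slots__, the instance has fixed storage and no __dict__ at all.
slotted_has_dict = hasattr(s, "__dict__")
```

Both objects behave the same from the caller's side; only the storage behind the attribute lookup differs.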
Algorithms matter, but if you already have good algorithms, or you're already linear time and just have a ton of data, I've seen 500x speedups from rewriting a single-threaded Python program as a multithreaded Rust program, where the algorithms were not improved at all.
It's the difference between a program running overnight vs. in 30 seconds. And if there are problems, the iteration speed from that is huge.
To be fair, Python as implemented today is horribly slow. You could leave the language the same but apply all the tricks and heroic efforts they used to make JavaScript fast. The language would be the same, but the implementations would be faster.
Of course, in practice the available implementations are very much part of the language and its ecosystems; especially for a language like Python which is so defined by its dominant implementation of CPython.
But a lot of the monkey-patching kind of things and dynamism of python also means a lot of those sorts of things have to be re-checked often for correctness, so it does take a ton of optimizations off the table. (Of course, those are rare corner cases, so compilers like pypy have been able to optimize for the "happy case" and have a slow fall-back path - but pypy had a ton of incompatibility issues and now seems to be dying).
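A small illustration of the kind of dynamism mentioned above, as a sketch only. The names `Greeter` and `shout` are hypothetical. The point is that a method can be swapped on a class while the program runs, so an optimizer cannot simply assume that `g.greet` keeps resolving to the same function.

```python
class Greeter:
    def greet(self):
        return "hello"


g = Greeter()
before = g.greet()


def shout(self):
    return "HELLO"


# Monkey-patch the class at runtime; the existing instance sees the change.
Greeter.greet = shout
after = g.greet()
```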
Doesn't the Python VM have inline caches? [0]
It didn't used to.
EDIT: python 3.11+: https://peps.python.org/pep-0659/
Later on I also changed some of the algorithms to faster ones, but their impact was much lower than the language change.
Which is only to say: that rewrite away from python story can also work to show python doing its job. Risk reduction, scaffolding, MVP validation.
A bunch of low level binaries stitched together with shell scripts is a lot faster than python, so not really sure what the point of this comparison is.
Python is an extremely versatile language, but if what you're doing is computing hashes and diffs, and generally doing entirely CPU-bound work, then it's objectively the wrong tool, unless you can delegate that to a fast, native kernel, in which case you're not actually using Python anymore.
That's often true, but not "every single time".
I doubt it wouldn't be significantly slower. I can't disprove it's possible to do this but it's totally possible for you to prove your claim, so I'd argue that the ball is in your court.
One of the reasons Mercurial lost the DVCS battle is its performance. Even the Mercurial folks admitted that was at least in part because of Python.
The reason that some more modern tools, like jj, really blow git out of the water in terms of performance is because they make good choices, such as doing a lot of transformations entirely in memory rather than via the filesystem. It's also because they're written in a language that can execute efficiently. Luckily, it's clear that modern tools like jj are heavily inspired by Mercurial, so we're not doomed to the UX and performance git binds us with.
Apparently I belong to the same club -- when I'm writing AWK scripts. (Arrays are hashmaps in a trenchcoat there.) Using hashmaps is not necessarily the indictment you apparently think it is, if the access pattern fits the problem and other constraints are not in play.
> It's amazing how much we've brainwashed folks to focus on algorithms and lose sight of how to actually properly optimize code. Being aware of how your code interacts with cache is incredibly important.
By the time you start worrying about cache locality you have left general algorithmic concerns far behind. Yes, it's important to recognize the problem, but for most programs, most of the time, that kind of problem simply doesn't appear.
It also doesn't pay to be dogmatic about rules, which is probably the core of your complaint, although unstated. You need to know them, and then you need to know when to break them.
No, it's always been true. It's just that at some point people got bored and tired of pointing it out.
It’s amusing you call Git fast. It’s notoriously problematic for large repos such that virtually every BigTech company has made a custom rewrite at some point or another!
For everything I've ever done, git was practically instant (except network IO of course). It's one of the fastest and most reliable tools I know. If it isn't fast for you, chances are you are on a slow Windows filesystem additionally impeded by a virus scanner.
The mere fact that Git is unable to handle large binary files makes it an unusable tool for literally every project I have ever worked on in my entire career.
Takes 21 seconds on my work laptop, indeed a corporate Windows laptop with antivirus installed. Majority of that time is simply network I/O. The cloned repository is 276 MB large.
Actually checking the kernel out takes 90 seconds. This amounts to creating 99195 individual files, totaling 2 GB of data. Expect this to be ~10 times faster on a Linux file system.
So what's your problem?
Phabricator and even Gerrit are significantly nicer.
Like, you can do a change that introduces a new API and one that updates all usages.
It's just easier to review those independently.
Or, you may have workflows where you have different versions of schemas and you always keep the old ones. Then you can do two commits (copy X to X+1; update X+1) where the change is obvious, rather than seeing a single diff which is just a huge new file.
I'm sure there are more cases. It's not super common but it is convenient.
Rant incoming...
Boy do I hate Github/Lab/Bucket style code reviews with a burning passion. Who the hell loses code review history? A record of the very thing that made my code better? The "why" of it all, that I am guaranteed to forget tomorrow morning.
Nobody would be using `--force` or `--force-with-lease` as a normal part of development workflow, of their own volition, if they had read that part of the git-push manpage and been horrified (as one should be).
The magit key sequence for this abominable operation is `P "f-u"`. And every single time I am forced to do it, I read "f-u" as it ought to be read.
Rebase-push is the way to do it (patch sets in Gerrit).
Rebase-force-push is absolutely not.
You see, any development workflow inevitably has to integrate changes from at least one other branch (typically latest develop or master), without destroying change history, nor review history. Gerrit makes this trivial.
It's a bit difficult to convey exactly why I'm so rah-rah Gerrit, because it is a matter of day-to-day experience of:
- Well, a single commit of a few lines to maybe a hundred lines *is* the correct unit of code review, rebase, revert etc. Manually "Sizing PRs" to that review context size is utter BS. I have better things to do in life than to book-keep PR sizes. Make a single well-contained, revertible commit. Then keep making those. And now you have a commit history that is clean, that you can merge, bisect, and bulk-revert at will. Octopus merges are a good thing. `git-log` is *designed* to let us view changes in any sequence we wish, *including* the so-called "linear" history. `git log --oneline`.
- Trivial for committer to send up reviews-preserving rebase-push responses to commit reviews (NO force-push, ever --- that's an "admin" action to *evict* / permanently wipe out disaster scenarios such as when someone accidentally commits and pushes out a plaintext secret or a giant blob of the executable of the source code etc.).
- Fast-for-the-reviewer, per-commit, diff-based, inline-commenting code reviews.
- The years-apart experience of being able to dig into any part of one's (immutable) software change history to offer a teaching moment to someone new to the team.
... to name a few key ones. (edit: add point about review size)
Nothing since (Gerrit, Reviewboard, Github, Critique) has measured up...
See https://stackoverflow.com/questions/20756320/how-to-prevent-...
I hope they fixed phabricator in the meantime.
One merged PR is a unit of change. At the end of the day, the steps you took to produce it aren't relevant to others.
My opinion of course; I'm open to understanding why preserving individual commits is beneficial.
See how the Linux kernel handles git history to see a good example of non-linear history and where it helps. They use merge commits, i.e. commits with more than one parent, all the time.
- merge some commits independently when partial work is ready.
- mark some commit as reviewed.
- UI to do interactive rebase and squash and edit individual commits. (I can do that well from the command line, but not when using the GitHub interface, and somehow not everyone on my team is familiar with that.)
- ability to attach a comment to a specific commit, or to the commit message.
- better way to visualize what changed over time in each forced push/revision (diff of diff)
Git itself already has the concept of commit. Why put this "stacked PR" abstraction on top of it?
Or is there a difference I don't see?
The idea is that it allows you to better handle working on top of stuff that's not merged yet, and makes it easier for reviewers to review pieces of a larger stack of work independently.
It's really useful in larger corporate environments.
I've used stacked PRs when doing things like upgrading react-native in a monorepo. It required a massive amount of changes, and would be really hard to review as a single pull request. It has to be landed all at once, it's all or nothing. But being able to review it as smaller independent PRs is helpful.
Stacking PRs is also useful even when you don't need to merge the entire stack at once.
Ahem, pioneered by Gerrit. But actually, I'm almost certain even that wasn't original art. I think Gerrit just brought it to git.
https://github.com/rietveld-codereview/rietveld https://en.wikipedia.org/wiki/Rietveld_(software) https://codereview.appspot.com/
1: https://git-scm.com/docs/git-rebase#Documentation/git-rebase...
I don't use GitHub, but I do work at one of the companies that popularized this workflow and it is extremely not a big deal. Pull, rebase, resolve conflicts if necessary, resubmit.
Especially since you get all of the same advantages with plain old stream of consciousness commits and merges using:
git merge --no-ff
git log --first-parent
git bisect start --first-parent
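To make the `--no-ff` plus `--first-parent` idea concrete, here is a toy model in Python rather than real git. The commit ids and the `parents` table are invented for illustration. It sketches how following only the first parent of each commit hides the messy feature-branch commits while a full traversal still reaches them.

```python
# Hypothetical history: commit id -> list of parent ids (first parent first).
parents = {
    "M0": [],
    "M1": ["M0"],
    "F1": ["M1"],        # stream-of-consciousness commits on a feature branch
    "F2": ["F1"],
    "F3": ["F2"],
    "M2": ["M1", "F3"],  # merge commit, as `git merge --no-ff` would create
}


def first_parent_log(tip):
    """Follow only the first parent, in the spirit of `git log --first-parent`."""
    out = []
    while tip is not None:
        out.append(tip)
        ps = parents[tip]
        tip = ps[0] if ps else None
    return out


def reachable(tip):
    """Every commit reachable from tip through any parent."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


mainline = first_parent_log("M2")
everything = reachable("M2")
```

In this model `mainline` contains only the commits made directly on the main branch, while `everything` still includes F1 to F3, so nothing is lost by keeping them.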
I've switched over pretty much entirely to Jujutsu (or JJ), which is an alternative VCS that can use Git as its backend so it's still compatible with GitHub and other git repos. My colleagues can all use git, and I can use JJ without them noticing or needing to care. JJ has merges, and I still use them when I merge a set of changes into the main branch once I've finished working on it, but it also makes rebases really simple and eliminates most of the footguns. So while I'm working on my branch, I can iteratively make a change, and then squash it into the commit I'm working on. If I refactor something, I can split the refactor out so it's in a separate commit and therefore easier to review and test. When I get review feedback, I can squash it directly into the relevant commit rather than create a new commit for it, which means git blame tends to be much more accurate and helpful - the commit I see in the git blame readout is always the commit that did the change I'm interested in, rather than maybe the commit that was fixing some minor review details, or the commit that had some typo in it that was fixed in a later commit after review but that relationship isn't clear any more.
And while I'm working on a branch, I still have access to the full history of each commit and how it's changed over time, so I can easily make a change and then undo it, or see how a particular commit has evolved and maybe restore a previous state. It's just that the end result that gets merged doesn't contain all those details once they're no longer relevant.
What's funny is how much better I understand git now, and despite using jj full time, I have been explaining concepts like rebasing, squashing, and stacked PRs to colleagues who exclusively use git tooling.
> So while I'm working on my branch, I can iteratively make a[...]which means git blame tends to be much more accurate and helpful
Everything here I can do easily with Magit with a few keystrokes. And Magit sits directly on top of git, just with interactivity. Which means if I wanted to I could write a few scripts with fzf (to help with selection) and they would be quite short.
> And while I'm working on a branch, I still have access to the full history of each commit...
Not sure why I would want the history for a specific commit. But there's the reflog in git, which is the ultimate undo tool. My transient workspace is only a few branches (a single one in most cases). And those are the few commits I worry about. Rebase and revert have always been all I needed to alter them.
And I don't rebase or squash because I need provenance in my job.
Surprisingly it never gained the adoption it deserved.
PR/MR is an "atomic" change (ideally the smallest change that can be landed separately - smallest makes it easier to review, bisect and revert)
Individual commits (or what "versions" are in Phabricator) are used for the evolution of the PR/MR to achieve that change.
But really I have 2 use cases for the commits:
1. the PR/MR is still too big, so I split it into individual commits (I know they will land together)
2. I keep the history of the evolution of the PR/MR in the commits ("changed foo to bar cause its a better approach")
Perhaps a future iteration of this feature will at least allow us to do something like merge just steps of it if they can be reordered.
Right now I manually do "stacked PRs" like this:
main <- PR A <- PR B (PR B's merge target branch is PR A) <- PR C, etc.
If PR B merges first, PR A can merge to main no problems. If PR A merges to main first, fixing PR B is a nightmare. The GitHub UI automatically changes the "target" branch of the PR to main, but instantly conflicts spawn from nowhere. Try to rebase it and you're going to be manually looking at every non-conflicting change that ever happened on that branch, for no apparent reason (yes, the reason is that PR A merging to main created a new merge commit at the head of main, and git just can't handle that or whatever).
So I don't really need a new UI for this, I need the tool to Just Work in a way that makes sense to anyone who wasn't Linus in 1998 when the gospel of rebase was delivered from On High to us unwashed Gentry through his fingertips.
git rebase --onto <new_commit_sha_generated_by_squash> <original_commit_sha_from_tip_of_merged_branch> <branch_name>
So for example, in this scenario:
PR1: main <- A, B (branch1)
PR2: main <- A, B, C, D (branch2)
PR3: main <- A, B, C, D, E, F (branch3)
When PR 1 and 2 are squash merged, main now looks like: S1 (squash of A+B), S2 (squash of C+D)
Then we run the following: git rebase --onto S2 D branch3
Which rewrites branch3 to: S1, S2, E, F
This operation moves the unique commits from the unmerged branch and replays them on top of the newly squashed commits on the base branch, avoiding the conflicts that come from re-applying changes that were already merged.
I’m conflicted about it. It seems like a good convenience, but I wouldn’t want my team to become dependent on an exclusive feature of a single provider.
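The `git rebase --onto S2 D branch3` walkthrough above can be modeled in a few lines of Python. This is a sketch, not how git is implemented: `Commit`, `history` and `rebase_onto` are hypothetical names, history is assumed to be linear, and "replaying" a commit just copies its label onto a new parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    change: str                       # label for the patch this commit carries
    parent: Optional["Commit"] = None


def history(tip):
    """Labels from the root to the tip, following parent links."""
    out = []
    while tip is not None:
        out.append(tip.change)
        tip = tip.parent
    return out[::-1]


def rebase_onto(new_base, upstream, branch_tip):
    """Replay the commits after `upstream` up to `branch_tip` onto `new_base`."""
    to_replay = []
    commit = branch_tip
    while commit is not None and commit is not upstream:
        to_replay.append(commit)
        commit = commit.parent
    tip = new_base
    for commit in reversed(to_replay):
        tip = Commit(commit.change, tip)  # new commit, same change, new parent
    return tip


# Original stack: base <- A <- B <- C <- D <- E <- F (branch3 points at F).
base = Commit("base")
A = Commit("A", base)
B = Commit("B", A)
C = Commit("C", B)
D = Commit("D", C)
E = Commit("E", D)
F = Commit("F", E)

# main after PR1 and PR2 were squash merged.
S1 = Commit("A+B", base)
S2 = Commit("C+D", S1)

branch3 = rebase_onto(S2, D, F)
rebased_history = history(branch3)
```

Only E and F are replayed, because everything up to and including D is excluded by naming D as the upstream.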
No idea if this feature fixes this.
Edit: Hopefully `gh stack sync` does the rebasing correctly (rebase --onto with the PR A's last commit as base)
Yeah, and I kind of see how git gets confused because the squashed commits essentially disappear. But I don't know why the rebase can't be smart when it sees that file content between the eventual destination commit (the squash) is the same as the tip of the branch (instead of rebasing one commit at a time).
main <- PR A <- PR B
Then you'll have
  main, squashed A
      \
       \-> PR A -> PR B
The tip of B is the list of changes of both A and B, while the tip of main is now the squashed version of the changes of A. Unless a branch tracks the end of A in PR B, it looks more like you want to apply A and B on top of A again.
A quick analogy to math:
main is X
A is 3
B is 5
Before, you have X + 3 + 5, which is equivalent to X + 8. But when you squash A onto X, it looks like (X + 3) + (3 + 5) from `main`'s point of view, while from B's it should be X + (3 + 5). So you need to rebase B to remove its 3 so that it becomes (X + 3) + 5.
Branches only store the commit at the top. The rest is found using the parent metadata in each commit (a linked list). Squashing A does not remove its commits. It creates a new commit with the tip of `main` as its parent and sets the new commit as the tip of `main`. But the list of commits in B still refers to the old tip of `main` as its ancestor and still includes the old commits of A. Which is why you can't merge the PR: it would apply the commits of A twice.
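The explanation above can be sketched with a toy commit table in Python. The ids (`X`, `a1`, `a2`, `b1`, `SA`) and branch names are made up, and every commit is assumed to have a single parent. The sketch shows that a squash merge adds one new commit to main and leaves PR B's ancestry untouched, so PR B still carries A's old commits.

```python
# Toy model: each commit id maps to its single parent id (None for the root).
parent = {"X": None, "a1": "X", "a2": "a1", "b1": "a2"}

# A branch is just a name pointing at one commit id.
branches = {"main": "X", "pr-a": "a2", "pr-b": "b1"}


def ancestry(tip):
    """Commit ids from the root to the tip, following the parent links."""
    out = []
    while tip is not None:
        out.append(tip)
        tip = parent[tip]
    return out[::-1]


# Squash-merge PR A: one NEW commit on top of main; a1 and a2 are not removed.
parent["SA"] = branches["main"]
branches["main"] = "SA"

main_history = ancestry(branches["main"])
pr_b_history = ancestry(branches["pr-b"])

# What PR B would still bring along: A's old commits plus B's own commit.
pending = [c for c in pr_b_history if c not in main_history]
```

In the analogy's terms, `pending` still contains the "3" (a1, a2) as well as the "5" (b1), even though main already has the 3 inside `SA`.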
All you need to do is pull main, then do an interactive rebase with the next branch in your stack with `git rebase -i main`, then drop all the commits that are from the branch you just merged.
That said, after the squash merge of A and git fetch origin, you want something like git rebase --update-refs --onto origin/main A C (or whatever the tip of the chain of branches is)
The --update-refs will make sure pr B is in the right spot. Of course, you need to (force) push the updated branches. AFAICT the gh command line tool makes this a bit smoother.
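Here is a rough Python model of what `--update-refs` adds on top of `--onto`, under heavy simplifying assumptions: single-parent history, invented commit ids, and new commits named by appending `'` to the old id. The function `rebase_onto_update_refs` is hypothetical and only mirrors the idea that branches pointing into the rebased range move along with it.

```python
# Toy model: commit id -> parent id; branch name -> commit id.
# SA is the squash of PR A, already on main.
parent = {"X": None, "a1": "X", "b1": "a1", "c1": "b1", "SA": "X"}
branches = {"main": "SA", "A": "a1", "B": "b1", "C": "c1"}


def rebase_onto_update_refs(new_base, upstream, tip_branch):
    """Replay the commits after `upstream` up to `tip_branch` onto `new_base`,
    then move every branch that pointed at a replayed commit."""
    old = []
    commit = branches[tip_branch]
    while commit != upstream:
        old.append(commit)
        commit = parent[commit]
    old.reverse()

    base = new_base
    rewritten = {}
    for commit in old:
        new_id = commit + "'"      # stand-in for the new commit hash
        parent[new_id] = base
        rewritten[commit] = new_id
        base = new_id

    # The --update-refs part: intermediate branches follow their commits.
    for name, at in branches.items():
        if at in rewritten:
            branches[name] = rewritten[at]


rebase_onto_update_refs(branches["main"], branches["A"], "C")
```

After the call, branch B sits on the rewritten copy of its commit, directly on top of the squash, without a separate rebase of B.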
I don't see how there is any other way to achieve this cleanly, it's not a git thing, it's a logic thing right?
The update branch button works normally when I don't stack the PRs, so I don't know. It just feels like a half baked feature that GitHub automatically changes the PR target branch in this scenario but doesn't automatically do whatever it takes for a 'git merge origin/main' to work.
Those are not hallucinated. PR B still contains all the old commits of A, which means merging would apply them twice. The changes in PR B are computed according to the oldest commits belonging to PR B and main, which is the parent of squashed A. That would essentially mean applying A twice, which is not good.
As for updating PR B, PR B doesn't know where PR A's commits (which are also in PR B) end, because PR A is not in main. Squashed A is a new commit and its diff corresponds to the diff of a range of commits in PR B (the old commits of PR A), not the whole of B. There's a lot of metadata you'd need to store to be able to update PR B.
I don't think we need to store any additional metadata to make the rebase just slightly smarter and able to skip over the "obvious" commits in this way, but I'm also just a code monkey, so I'm sure there are Reasons.
The GitHub UI may change the target to main but your local working branch doesn't, and that's where you `rebase --onto` to fix it, before pushing to origin.
It's appropriate for GitHub to automatically change the target branch, because you want the diff in the UI to be representative. IIRC GitLab does a much better job of this but this is already achievable.
What is actually useful with natively supported stacks is if you can land the entire stack together and only do 1 CI/actions run. I didn't read the announcement to see if it does that. You typically can't do that even if you merge PR B,C,D first because each merge would normally trigger CI.
EDIT: I see from another comment (apparently from a GitHub person) that the feature does in fact let you land the entire stack and only needs 1 CI run. wunderbar!
I mean if you've got a feature set to merge into dev, and it suddenly merges into main after someone merged dev into main then that's very annoying.
I never understood the PR=branch model GitHub defaulted to. Stacked commits (à la Phabricator/Gerrit) always jibed more with how my brain reasons about changes.
Glad to see this option. I guess I'll have to install their CLI thing now.
Presenting only CLI commands in the announcement wasn't a good choice.
You can also run a combination of these. For example, use another tool like jj to develop locally, push up the branches, and use the gh CLI to batch create a stack of n PRs, without touching local state.
Probably relies on some internal metadata.
Wait 10 minutes and you’re done.
Everyone will have their own way of structuring stacks, but I've found it great for the agent to plan a stack structure that mirrors the work to be done.
I mean a branch is just jamming a flag into a commit with a polite note to move the flag along if you're working on it. You make a long trail, leave several flags and merge the whole thing back.
Of course leaving multiple waypoints only makes sense if merging the earlier parts makes any sense, and if the way you continue actually depends on the previous work.
If you can split it into several small changes made to a central branch it's a lot easier to merge things. Otherwise you risk making a new feature codependent on another even if there was no need to.
I hope the GitHub CLI will include syncing[3] 'stacks' locally with upstream in a similar way.
[1]: https://www.git-town.com/stacked-changes.html
[2]: https://github.com/marketplace/actions/git-town-github-actio...
Curious whether this changes anything for the AI-assisted workflow. Right now I let Claude Code work on a feature branch and it naturally produces one big diff. Stacked PRs could be interesting if agents learned to split their own work into logical chunks.
To me, stacked PRs seems overly complicated. It seems to boil down to propagating git rebases through stacks of interdependent branches.
I'm fine with that as long as I don't have to deal with people force pushing changes and routinely rewriting upstream history. It's something you probably should do in your own private fork of a repository that you aren't sharing with anyone. Or if you are, you need to communicate clearly. But if the goal is to produce a stack of PRs that in the end merge cleanly, stacked PRs might be a good thing.
As soon as you have multiple collaborators working on a feature branch force pushing can become a problem and you need to impose some rules. Because otherwise you might end up breaking people's local branches and create work for them. The core issue here is that in many teams, people don't actually fork the main repository and have push access to the main repository. Which emulates the central repository model that people were used to twenty years ago. Having push access is not normal in most OSS projects. I've actually gotten the request from some rookie developers that apparently don't get forking to "please give me access to your repository" on some of my OSS projects.
A proper pull request (whether stacked or not) to an OSS project needs to be clean. If you want to work on some feature for weeks you of course need mechanisms to stay on top of upstream changes. OSS maintainers will probably reject anything that looks overly messy to merge. That's their job.
* It amounts to doing N code reviews at once rather than a few small reviews which can be done individually
* GitHub doesn't have any good UI to move between commits or to look at multiple at once. I have to find them, open them in separate tabs, etc.
* GitHub's overall UX for reviewing changes, quickly seeing a list of all comments, etc. is just awful. Gerrit is miles ahead. Microsoft's internal tooling was better 16 years ago.
* The more commits you have to read through at once the harder it is to keep track of the state of things.
I truly do not comprehend this view. How is reviewing N commits different from, or more review work than, reviewing N separate pull requests? It's the same constant.
A chain of commits:
* Does not go out for review until the author has written all of them
* Cannot be submitted even in partial form until the reviewer has read all of them
Reviewing a chain of commits, as the reviewer I have to review them all. For 10 commits, this means setting aside an hour or whatever - something I will put off until there's a gap in my schedule.
For stacked commits, they can go out for review when each commit is ready. I can review a small CL very quickly and will generally do so almost as soon as I get the notification. The author is immediately unblocked. Any feedback I have can be addressed immediately before the author keeps building on top of it.
Single PR with commits A, B, C: You must merge all commits or no commits. If you don't approve of all the commits, then none of the commits are approved.
3 stacked PRs: I approve PR A and B, and request changes on PR C. The developer of this stack is on vacation. We can incrementally deliver value by merging PRs A and B since those particular changes are blocking some other engineer's work, and we can wait until dev is back to fix PR C.
This isn't Reddit, people. You're not supposed to downvote just because you disagree. Downvotes are for people who are being assholes, spamming, etc...
If you disagree with a take, reply with a rebuttal. Don't just click downvote.
That said, while he hasn't posted here for a long time, this is still in the guidelines:
> Please don't post comments saying that HN is turning into Reddit. It's a semi-noob illusion, as old as the hills.
[1] https://andrewlock.net/working-with-stacked-branches-in-git-...
Until you can make it effortless, maintaining a substantial commit structure and constantly rebasing to add changes to the proper commit quickly turns into more effort than just waiting to the end and manually editing a monster diff into multiple sensible commits. But we take the challenge and tell ourselves we can do better if we're proactive.
Stacked PRs in my experience have primarily been a request to merge in a particular order. If you're the only merger, as in GP's case, there's no need to request this of yourself.
----
OK, I found this from official docs, so this feature is now quite useless to me:
> Can stacks be created across forks?
> No, Stacked PRs currently require all branches to be in the same repository. Cross-fork stacks are not supported.
One part that seems like it's going to feel a little weird is how merging is set up[1].
That is, if I merge the bottom of the stack, it'll rebase the others in the stack, which will probably trigger a CI test run. So, if I have three patches in the stack, and I want to merge the bottom two, I'd merge one, wait for tests to run on the other, merge the second vs. merge just those two in one step (though, without having used it, can't be sure about how this'd work in practice—maybe there's some way to work around this with restacking?)
[0]: <https://docs.gitlab.com/cli/stack/>
[1]: <https://github.github.com/gh-stack/guides/stacked-prs/#mergi...>
As we have it designed currently, you would have to wait for CI to pass on the bottom two and then you can merge the bottom two in one step. The top of the stack would then get rebased, which will likely trigger another CI run.
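A rough way to compare the two merge orders is to count restack-triggered CI runs. This is only a sketch in Python under one loud assumption: every PR left in the stack is rebased and re-tested once after each merge step. The function name is hypothetical and this does not describe how GitHub actually schedules CI.

```python
def restack_ci_runs(stack_size, merge_steps):
    """Count CI runs caused by rebasing the PRs left in the stack after each merge step.

    Assumption (not confirmed by the docs): each remaining PR gets one CI run per step.
    """
    remaining = stack_size
    runs = 0
    for merged in merge_steps:
        if merged > remaining:
            raise ValueError("cannot merge more PRs than remain in the stack")
        remaining -= merged
        runs += remaining  # every PR still in the stack is rebased and re-tested
    return runs


one_at_a_time = restack_ci_runs(3, [1, 1])  # merge the bottom PR, then the next one
both_at_once = restack_ci_runs(3, [2])      # merge the bottom two in one step
print(one_at_a_time, both_at_once)
```

Under that assumption, merging the bottom two in one step costs fewer re-runs than merging them one by one.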
Thanks for the callout - we'll update those docs to make it clear multiple PRs can be merged at once.
My biggest gripe with GitHub when working with stacks – and something that's not clarified in these docs – is whether fast-forward merges are possible. Its "Merge with rebase" button always rewrites the commit. They do mention that the stack needs to be rebased in order to merge it. My workaround has been `git merge --ff-only top-branch-of-stack` to merge the entire stack locally into main (or anything in between actually) and then push. GitHub neatly recognizes that each PR in the stack is now in main and marks them all as merged. If there are subsequent PRs that weren't merged it updates the base branch.
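For anyone unsure when `git merge --ff-only` will succeed: it only works when the target branch's tip is already an ancestor of the branch being merged. A minimal Python sketch of that check follows; the commit names and the `parents` map are invented for illustration, and this models the rule rather than calling Git.

```python
def ancestors(parents, commit):
    """Return the set containing `commit` and every commit reachable through its parents."""
    seen = set()
    todo = [commit]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(parents.get(current, []))
    return seen


def can_fast_forward(parents, target_tip, source_tip):
    """A fast-forward is possible only when the target's tip is in the source's history."""
    return target_tip in ancestors(parents, source_tip)


# Hypothetical history: a1 sits on m1, b1 sits on a1, and m2 is a later commit on main.
parents = {"m1": [], "a1": ["m1"], "b1": ["a1"], "m2": ["m1"]}
print(can_fast_forward(parents, "m1", "b1"))  # main still at m1: stack is up to date
print(can_fast_forward(parents, "m2", "b1"))  # main moved to m2: stack needs a rebase first
```

After a fast-forward to `b1`, both `a1` and `b1` are in main's history, which fits the observation that every PR in the stack then shows as merged.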
Having said that, it's great to see GitHub getting a proper UI for this. It's also great that it understands the intent that branch B that goes on top of branch A is a stack and thus CI runs against. I just hope that it's not mandatory to use their CLI in order to create stacks. They do cover this briefly in the FAQ[3], but it might be necessary to use `gh stack init --adopt branch-a branch-b branch-c`. On the other hand, if that removes the need to manually create the N PRs for my stack, that's nice.
[1]: https://git-scm.com/docs/git-rebase#Documentation/git-rebase...
[2]: https://github.com/tummychow/git-absorb
[3]: https://github.github.com/gh-stack/faq/#will-this-work-with-...
git -c push.default=matching push --force-with-lease --force-if-includes
In other words, I force push all branches that have a matching upstream by changing my config on the fly... or you don't bother with all that and simply do:
- gh stack init
- gh stack push
- gh stack submit
The point is that I want to use Git, a tool and skill that is portable to other platforms.
You want to use git.
Most people around you want to get things done.
Just using git, you'd send a set of patches, which can be reviewed, tested and applied individually.
The PR workflow makes a patch series an indivisible set of changes, which must be reviewed, tested and applied in unison.
And stacked PRs try to work around this issue, but the issue is how PRs are implemented in the first place.
What you really want is the ability to review individual commits/patches again, rather than work on entire bundles at once. Stacked PRs seems like a second layer of abstraction to work around issues with the first layer of abstractions.
Then the commits in the PR are not held to the standard of being acceptable to apply, and they are squashed together when the PR is merged.
This allows for a work flow in which up until the PR is merged the “history of developing the PR” is preserved but once it is merged, the entire PR is applied as one change to the main branch.
This workflow combined with stacked PRs allows developers to think in terms of the “smallest reviewable and applicable change” without needing to ensure that during development their intermediate states are safe to apply to main.
The traditional tools (mailing-lists, git branches, Phabricator) represented each change as a difference between an old version of the code and the proposed new version. I believe Phabricator literally stored the diff. They were called “diffs” and you could make a new one by copying and pasting into a <textarea> before pressing save*.
The new fangled stuff (GitHub and its clones) recorded your change as being between branches A and B, showed you the difference on the fly, and let you modify branch B. After fifteen years of this we are now seeing the option for branch A to be something other than main, or at least for this to be a well supported workflow.
In traditional git land, having your change as a first class object — an email or printout or ph/D1234 with the patch included — was the default workflow!
*Or some other verb meaning save.
Stacked PRs are not breaking up a set of commits into divisible units. Like you said, you can already do that yourself. They let you continue to work off of a PR as your new base. This lets you continue to iterate asynchronously to a review of the earlier PRs, and build on top of them.
You often, very often, need to stage your work into reviewer-consumable units. Those units are the stack.
This will help some since you can more easily split PRs into units that make sense to squash at the end, but it still seems like not doing this on a per-commit basis is a disadvantage compared to Gerrit. With Gerrit I can use all the built-in Git rebase/squash/fixup tools to manage the commit stack and push everything in one go. I don't think there's nearly as convenient a way to work with stacked branches in Git.
Also the rationale for having a chain of branches pointing to each other was so the diff in a PR shows just the relevant changes from the specific branch, not the entire set of changes going back to the parent/trunk.
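Roughly speaking, the commits a PR shows are the ones reachable from its head branch but not from its base branch, which is why the choice of base matters. Here is a small Python sketch of that idea; the history and function names are hypothetical, and it models the set difference rather than GitHub's actual diff computation.

```python
def reachable(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen = set()
    todo = [tip]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(parents.get(current, []))
    return seen


def pr_commits(parents, base, head):
    """Commits shown in a PR: in the head's history but not in the base's."""
    return sorted(reachable(parents, head) - reachable(parents, base))


# Hypothetical history: branch-a adds a1 and a2 on main (m1); branch-b adds b1 on top of a2.
parents = {"m1": [], "a1": ["m1"], "a2": ["a1"], "b1": ["a2"]}
print(pr_commits(parents, base="m1", head="b1"))  # branch-b against main: a1, a2 and b1
print(pr_commits(parents, base="a2", head="b1"))  # branch-b against branch-a: only b1
```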
Curious how you're thinking about it?
There is already an option to enable review comments on individual commits (see the API endpoint here: https://docs.github.com/en/rest/guides/working-with-comments...). Self-stacking PRs seem redundant.
Graphite (which they seem to be inspired by) has frozen branches exactly for that use case:
I notice a lot of examples just vaguely mention "oh, you can have others review your previous changes while you continue working", but this one doesn't make sense to me. Often times, the first set of commits doesn't even make it to the end result. I'm working on a feature using lexical, and at this point I had to rewrite the damn thing 3 times. The time of other devs is quite valuable and I can't imagine wasting it by having them review something that doesn't even make it in.
Now, I have been in situations where I have some ready changes and I need to build something on top. But it's not something just making another branch on top + rebase once the original is merged wouldn't solve.
Is this really worth so much hype?
Imagine you have some task you are working on, and you wish to share your progress with people in bite sized chunks that they can review one at a time, but you also don’t want to wait for their reviews before you continue working on your task.
Using a stacked set of PRs you can continue producing new work, which depends on the work you’ve already completed, without waiting for the work you’ve already completed to be merged, and without putting all your work into one large PR.
> The time of other devs is quite valuable and I can't imagine wasting it by having them review something that doesn't even make it in.
this is not what stacked diffs are for. stacked diffs don't mean putting up code that isn't ready. for example you are updating some library that needs an API migration, or a compiler version that adds additional stricter errors. you need to touch hundreds of files around the repository to do this. rather than putting up one big diff (or PR) you stack up hundreds of them that are trivial to review on their own, they land immediately (mitigating the risk of merge conflicts as you keep going), then one final one that completes the migration.
So, when I saw this announcement seemed interesting but don’t see the point of it yet.
Not for me, but I'm glad it fits other people's workflows. I just hope it doesn't encourage people to try to make poorly reasoned changes!
I've just written those smaller PRs at once, or in quick enough succession that the previous PRs weren't merged before the later ones were ready. And the later ones relied on the previous ones because that's how working on a feature works.
The earlier PRs are absolutely reviewable and testable without relying on the later ones. The later ones are just treating the earlier ones as part of the codebase. I.e. everything here looks like two different PRs except the timing.
An obvious example would be "implement API for a feature" and then "implement UI that uses that API". Two different PRs. The second fundamentally relies on the first.
1) API implementation - Including tests and docs this should be perfectly acceptable to merge and review independently
2) UX implementation - Feature flagged, dummy API responses, easy to merge + review
3) One quick "glue" PR where the feature can be integration tested etc
This prevents awful merge conflicts, multiple rounds of increasingly complex stacked reviews, and a host of other annoyances.
Is there any reason that the stacked PR workflow is better that I'm ignoring or overlooking?
It's not a simple problem to solve, we can't all just jump because someone finished some work after all. But if the PRs are OK to rubber stamp, and merge, and they're safely behind a feature flag, then it could just be as simple as letting the submitter merge without the need for an extra review. That can of course be contentious, but then we can ask "why not?" and figure out what non-human gateways need to be added to help make it possible etc.
I'm finding myself increasingly interested in understanding what friction can be removed from the software review, merge and release process, without sacrificing safe, well tested, understandable code that follows good standards.
I have never understood what this even means.
Either changes are orthogonal (and can be merged independently), or they’re not. If they are, they can each be their own PR. If they’re not, why do you want to review them independently?
If you reject change A and approve change B, nothing can merge, because B needs A to proceed. If you approve change A and reject change B, then the feature is only half done.
Is it just about people wanting to separate logical chunks of a change so they can avoid getting distracted by other changes? Because that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.
I’ve tried my best to give stacked-diff proponents the benefit of the doubt but none of it actually makes sense to me.
> If they’re not, why do you want to review them independently?
For this example, you may want review from both a backend engineer and a frontend engineer. That said, see this too though:
> that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.
If you do this in a PR, both get assigned to review the whole thing. Each person sees the code that they don't care about, because they're grouped together. Notifications go to all parties instead of the parties who care about each section. Both reviews can proceed independently in a stack, whereas they happen concurrently in a PR.
> If you approve change A and reject change B, then the feature is only half done.
It depends on what you mean by "the feature." Seen as one huge feature, then yes, it's true that it's not finished until both land. But seen as two separate but related features, it's fine to land the independent change before the dependent one: one feature is finished, but the other is not.
There are two separate issues you’re bringing up:
- Both groups being “assigned” the PR: fixable with code owners files. It’s more elegant than assigning diffs to people: groups of people have ownership over segments of the codebase and are responsible for approving changes to it. Solves the problem way better IMO.
- Both groups “seeing” all the changes: I already said GitHub lets you view single commits during PR review. That is already a solved problem.
And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent. Yes, the second PR is a superset of the first, but we’ve already established that (1) the second change isn’t orthogonal to the first one and can’t be merged independently anyway, and (2) reviewers can select only the commits that are in the frontend range. Generally you just mark the second PR as draft until the first one merges (or do what Gitlab does and mark it as “depends on” the first, which prevents it from merging until the first one is done.) The first PR being merged will instantly make the second PR’s diff collapse to just the unique changes once you rebase/merge in the latest main, too.
All of this is to explain how we can already do pretty much all of this. But in reality, it’s silly to have people review change B if change A hasn’t landed yet. A reviewer from A may completely throw the whole thing out and tell you to start over, or everything could otherwise go back to the drawing board. Making reviewers look at change B before this is done is a potential for a huge waste of time. But then you may think reviewers from change B may opt to make the whole plan go back to the drawing board too, so what makes A so special? And the answer is that both are a bad approach: just make the whole thing in one PR, and discuss it holistically. Code owners files are for assigning ownership, and breaking things into separate commits is to help people look at a subset of the changes. (Or just, like, have them click on the folder in the source tree they care about. This is not a problem that needs a whole new code review paradigm.)
Code owners automatically assigns reviewers. You still end up in the state where many groups are assigned to the same PR, rather than having independent reviews.
> I already said GitHub lets you view single commits during PR review.
Yes, you can look at them, but your review is still in the context of the full PR.
> And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent.
The feature being discussed here is making this a first-class feature of the platform, much nicer to use. The second PR is "stacked" on top of the first.
> Yes, you can look at them, but your review is still in the context of the full PR.
Why is this a bad thing? I don’t get it. This has literally never been a problem once in my career. Is the issue that people can’t possibly scroll past another discussion? Or… I seriously am racking my brain trying to imagine why it’s a bad thing to have more than one stakeholder in a discussion.
I can think of a lot of reasons why doing the opposite, and siloing off discussions, leads to disaster. That is something I’ve encountered constantly in my career. We start out running an idea past group A, they iterate, then once we reach a consensus we bring the conclusion to group B and they have concerns. But oh, group A already agreed to this so you need to get on board. So group B feels railroaded. Then more meetings are called and we finally bring all the stakeholders together to discuss, and suddenly hey, group A and B both only had a partial view of the big picture, and why didn’t we all discuss this together in the first place? That’s happened more times in my career than I can count. The number of times group B is mad that they have to move their finger to scroll past what group A is talking about? Exactly zero.
This isn't about siloing discussions: it's about focus. You can always see the full stack if you want to go look at the other parts, the key is that you don't have to.
The goal is to get thoroughly reviewed changes. It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one. Keeping commits small and tightly reviewed leads to better outcomes in the end. Massive PRs lead to rubber stamps of +1.
I agree that that scenario sounds like a nightmare. But I don't think that a PR is the right place to solve that problem: it sounds like something that should have been sorted before any of the code was written in the first place.
This is true if the changes are orthogonal and are truly independent. One should always favor small independent changes if one can.
But when changes are all actually part of the same unit, and aren’t separable (apart from maybe the first of N of them which may be mergeable independently), proponents always seem to advocate that stacked diffs can somehow change this fact. “Oh if only we had stacked diffs we could break this into smaller changes”, ignoring the fact that no, they’d still be ordered and dependent on one another.
Stacked diffs seem like a UI convenience for reviewers… that’s fine I guess. GitHub is basically what you get when you ask the question “how can we make code review as tedious and unhelpful as possible”, and literally anything would be better than what we have (seriously I could fill a book with how bad GitHub is. I don’t think I could design a worse experience if I tried.) So, maybe I should just be happy they’re trying anything.
This is the model that the kernel uses, as well as tons of other projects (any Gerrit user, for example), and so it has gotten real-world use and at scale. That said, everyone is also entitled to their preferences :)
Nah.
The kernel uses a mailing list, and a “review” means a mailing list thread. With some nice CLI tools to integrate with git when you want to actually apply the patch (or start a review thread.)
In that world, “[PATCH 2/5]” (or whatever) in the subject title, and a different CC list for each patch, is a nice way to be able to ensure different subsets of the patch series have different discussions. That’s great.
But if you’re going to compare this to a GitHub UI, you have to choose the basis for comparison, because the two are so utterly different. Choosing one aspect (can we make sure discussions are kept separate), and saying “therefore the kernel uses stacked diffs” is a huge misrepresentation of how different GitHub’s approach is.
Because the kernel approach is the platonic ideal of a code review: it’s a simple threaded discussion between stakeholders, centered around a topic (the patch, which is inlined right in the email.) I would wager close to zero kernel maintainers actually look at the diffs exclusively via their email client. They probably just check out the changes locally and look at them, and the purpose of the mailing list is to facilitate focused discussion on parts of the change (which is all we really want, in the end.)
GitHub has so thoroughly shit the bed on actually developing a good model of “threaded discussion about a change”, that you have to change the way you think about git’s model to fix how awful GitHub is at allowing review discussion to stay focused. You shouldn’t need to think about stacked diffs and multiple PR’s. You should use git branches as intended, multiple commits representing changes, and a merge meaning “this branch makes it or not.” That GitHub’s UI for discussing subsets of a change is so abysmal, does not mean the model is wrong. It means their discussion system is so abysmal that a mailing list TUI can run circles around it. Fixing this is GitHub’s problem, and doesn’t require any changes to how PR’s should be split up.
If you have a 2500-line PR with 5 500-line commits, GitHub should not require you to split things up further in any way, just to unfuck their discussion system.
Random idea I spent 10 seconds thinking about: let me start a “here’s a thread discussing the UI changes” and add folks to it, and “here’s a thread discussing the backend changes”, and add folks to that. I can then say “let’s not merge this until both threads are green”. You still see the whole change in the UI. (You can click directories to drill into the changes, that solves the “but the diff is too big” issue.) Discussion on a chunk of the diff is scoped to a discussion thread, which you select when sending the message. Thus, all discussion on any part of the diff is still scoped to a “discussion thread” of arbitrary subsets of stakeholders.
None of this needs me to change how I split up my git branches, an entire logical change is still either “merged” or “not-merged” (seriously who cares about the Pyrrhic victory of merging only change 1/N), and if we want to limit scopes of discussion to subsets of a change, we can just… do that.
All of the advantages, like "it’s a simple threaded discussion between stakeholders, centered around a topic", is exactly why people like stacked diffs over PRs.
GitHub is doing "stacked PRs", which is like stacked diffs but more like PRs in the sense that they're stacked branches rather than stacked diffs. I agree that this seems less ideal, but they also are putting it into an existing project, rather than rebuilding everything around it. There's pros and cons to both approaches, but I agree that I'd prefer a native system built for this, personally. I'm still glad they're going to be popularizing the general concept.
My point is that the LKML and what GitHub do is so different that the definition of “stacked diffs in general” can only describe a tiny aspect of each, if you want to call both of their approaches by the same name. From where I sit, the only common element between them is “they offer a way to keep discussion separated.”
If that’s all people are actually complaining about, there are a thousand better ways to “keep discussion separated” that don’t require me to pretend that it’s ok that only a subset of my branch is ok to merge.
In git, a branch is the thing you either merge or don’t. You merge multiple commits at once, or you don’t. It’s a great model. Breaking up the branch into smaller pieces, and giving people the impression it’s ok to merge the first commit but not the rest, just to unfuck the discussion UX, is putting the cart before the horse. I make a branch strictly because I want it to either all merge or none of it merge. It’s the only sensible approach in my book. If a discussion system is so bad that this is unworkable, it means the discussion system is bad, it doesn’t mean the conceptual model of a merge is bad.
That's fine, what I mean is, when we started this convo, I thought you were asking about the general concept of stacked diffs, not the specifics of what GitHub is releasing here. That's my mistake for misunderstanding, sorry about that.
This is also (assumedly, anyway) why they're calling this "stacked PRs" and not "stacked diffs," because what they're doing is slightly different than Gerrit, Phabricator, Critique, etc.
After thinking about the whole thing I think I can summarize my opinion a lot better now:
Stacked diffs are a category error. Units of discussion, and units of integration, should not be conflated.
A branch is my unit of intended integration: merge all of it or none of it. The fact that reviewers need smaller slices to discuss does not imply those slices should become independently landable history objects. That’s a UX concern for the review tool, not something I should have to encode into Git history.
The ideal system would let me seed discussion however I want (by commit, by path, by subsystem, by semantic region of the diff, etc) without forcing me to pretend those are separate merge units.
Github nails the "merge unit" (CI runs against the whole branch, the branch either merges or doesn't, etc), but absolutely fumbles in the discussion part. I hate that I'd have to change the merge unit just to fix their discussion UX.
for example, this stack adds a search bar: https://tangled.org/tangled.org/core/pulls/1287
- the first PR in the stack creates a search index.
- the second one adds a search API handler.
- the last few do the UI.
these are all related. you are right that you can do this by breaking a change into commits, and effectively that is what i do with jujutsu. when i submit my commits to the UI, they form a PR stack. the commits are individually reviewable and updatable in this stacking model.
gh's model is inherently different in that they want you to create a new branch for every new change, which can be quite a nuisance.
have written more about the model here: https://blog.tangled.org/stacking/
> - the second one adds a search API handler.
> - the last few do the UI.
So you're saying you're going to merge (and continuously integrate, perhaps to production) a dangling, unused search index, consuming resources with no code using it, just to make your review process easier?
It's very depressing that review UX is so abysmal that you have to merge features before they're done just to un-fuck it.
Why can't the change still be a big branch that is either all merged or not... and people can review it in chunks? Why do we require that the unit of integration equals the unit of review?
The perverse logic always goes something like this:
"This PR is too big, break it up into several"
Why?
"It's easier to review small, focused changes"
Why can't we do that in one PR?
"Because... well, you see GitHub's UI makes it really hard to ..."
And that ends up being the root-cause answer. I should be able to make a 10,000 line change in a single commit if I want, and reviewers should be able to view subsets of it however they want: A thread of discussion for the diffs within the `backend` folder. A thread of discussion for the diffs within the `frontend` folder, etc etc. Or at the very least I should be able to make a single branch with multiple commits based on topic (and under no obligation for any of them to even compile, let alone be merge-able) and it should feel natural to review each commit independently. None of this should require me to contort the change into allowing integration partially-completed work, just to allow the review UX to be manageable.
Just covering the review process:
Yes, you can structure your PR into 3 commits to be reviewed separately. I occasionally structure my PRs like this - it does help in some cases. But if those separate parts are large, you really want more structure around it than just a commit.
For example, let's say you have parts A, B and C, with B depending on A, and C depending on B.
1. I may want to open a PR for A while still working on B. Someone may review A soon, in which case I can merge immediately. Or perhaps it will only be reviewed after I finished C, in which case I'll use a stacked PR.
2. The PR(s) may need follow up changes after initial review. By using stacked PRs instead of just separate commits, I can add more commits to the individual PRs. That makes it clear what parts those commits are relevant to, and makes it easy to re-review the individual parts with updated changes. Separate commits don't give you that.
Stacked PRs is not a workflow I'd use often, but there are cases where it's a valuable tool.
Then apart from the review process, there are lots of advantages to keeping changes small. Typically, the larger a change, the longer it lives in a separate branch. That gives more time for merge conflicts to build up. That gives more time for underlying assumptions to change. That makes it more difficult to keep a mental map of all the changes that will be merged.
There are also advantages to deploying small changes at a time, that I won't go into here. But the parent's process of potentially merging and deploying the search index first makes a lot of sense. The extra overhead of managing the index while it's "unused" for a couple of days is not going to hurt you. It allows early testing of the index maintenance in production, seeing the performance overhead and other effects. If there's an issue, it's easy to revert without affecting users.
The overall point is that as features become large, the entire lifecycle becomes easier to manage if you can split it into smaller parts. Sometimes the smaller parts may be user-visible, sometimes not. For features developed in a day or two, there's no need to split it further. But if it will span multiple weeks, in a project with many other developers working on it, then splitting into smaller changes helps a lot.
Stacked PRs is not some magical solution here, but it is one tool that helps manage this.
PS: I love the concept of tangled. I currently use `sourcehut` but may soon move to tangled.
you have hundreds or thousands of files to fix. that is unreviewable as a single commit, but as a per-file, per-library, per-oncall, etc. commit it is not that bad.
Why is it intrinsically unreviewable as a single commit? Why can't the discussion/review system allow scoping discussions to a single folder of the change, or a single library, or a particular code-owner's "slice" of the repo, etc? The answer to this question is always unsatisfactory to me. It always ends up being "because GitHub's UI makes it hard to <foo>" and it's just taken as an immutable law of the universe that we're stuck with that UI's limitations.
If a change is huge, find some basis by which to discuss it in smaller chunks. That basis doesn't have to be the PR itself (such that you have to make smaller PR's to make discussion manageable.) It can be a subdirectory of the diff. A wildcard-match over the source files. Whatever the case needs to be, the idea is still that the discussion UX shouldn't make reviewing large changes painful.
Why do we tolerate the fact that GitHub doesn't let you say "approved for changes in `frontend/*`" or "approved for the changes I'm a code-owner of", and have the PR check system mark the PR as approved once all slices have been approved? Why do we tolerate that a thousand-file change is "unreviewable"? Instead we have to change our unit of integration, allowing partially-complete work to be merged, just because the review UX sucks.
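The kind of per-slice approval described here is easy to sketch. The following Python is purely hypothetical (GitHub has no such feature, and the function and file names are made up): each approval is a glob pattern, and the change counts as approved once every changed file is covered by at least one approved pattern.

```python
from fnmatch import fnmatchcase


def unapproved_files(changed_files, approved_patterns):
    """Return the changed files that no approved glob pattern covers yet."""
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in approved_patterns)
    ]


def fully_approved(changed_files, approved_patterns):
    """The whole change is approved once every file falls inside an approved slice."""
    return not unapproved_files(changed_files, approved_patterns)


changed = ["frontend/search.tsx", "frontend/nav.tsx", "backend/index.py"]
print(unapproved_files(changed, ["frontend/*"]))             # backend slice still pending
print(fully_approved(changed, ["frontend/*", "backend/*"]))  # all slices approved
```

The branch stays one merge unit throughout; only the bookkeeping of who approved which slice changes.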
Why would you waste time faffing about building B on top of a fantasy version of A? Your time is probably better spent reviewing your colleague’s feature X so they can look at your A.
The feature is also half done in this case. The author can fix up the concerns the reviewer had in A and then both can be merged at the same time.
As far as splitting work into different PRs that need coordinated merging, I've only ever encountered that when it's a long lived refactor / feature.
OK, yeah, I’m with you.
> Stacked PRs solve this by breaking big changes into a chain of small, focused pull requests that build on each other — each one independently reviewable.
I don’t get this part. It seems like you are just wasting your own time building on top of unreviewed code in branches that have not been integrated in trunk. If your reviews are slow, fix that instead of running ahead faster than your team can actually work.
Plus there's no review that's instant. Being able to continue working is always better.
> The gh stack CLI handles the local workflow […]
That's not "how it works", that's "how you['re supposed to] use it"… for "how it works" I would've expected something like "the git branches are named foo1 foo2 and foo3 and we recognize that lorem ipsum dolor sit amet…"
…which, if you click the overview link, it says "The CLI is not required to use Stacked PRs — the underlying git operations are standard. But it makes the workflow simpler, and you can create Stacked PRs from the CLI instead of the UI." … erm … how about actually explaining what the git ops are? A link, maybe? Is it just the PRs having common history?
…ffs…
(In case it's not obvious: I couldn't care less for using a GH specific CLI tool.)
They also allow reviewing commits individually, which is very frustrating to do without dedicated support (unless you devolve back to mailing list patch stacks).
I'm not a huge fan, since stacked PRs mean the underlying issues don't get addressed (reviews clearly taking too long, too much content in there), but it seems they want something that works for their customers, right now, as they work in real life.
I guess this is why you're getting downvoted. Commits can be edited.
If I had to guess a reason they were downvoted (and I didn't downvote, to be clear), it's probably because people see stacked diffs as specifically solving "reviews clearly taking too long, too much content in there", and so it feels contradictory. Then again, as I said, I didn't downvote!
It's a big improvement (assuming they've done it right).
Every time I try to do it manually, I wind up screwing everything up.
Very interested to check it out.
Here's something that would be useful: To break down an already big PR into multiples that make up a stack. So people can create a stack and add layers, but somehow re-order them (including adding something new at the first position).
I use jj to stack branches, so I'll just be using the UI to do GitHub PR stacks.
Usually when you develop a "full stack" thing you continuously massage the backend into place while developing frontend stuff. If you have 10 commits for frontend and 10 for backend, they might start with 5 for backend, then 5 commits to each branch to iron out the interface and communication, and finally 5 commits on the frontend. Let's call these commits B1 through B10 and F1 through F10. Initially I have a backend branch based on main with commits B1 through B5.
Then I have a frontend branch based on B5 with commits F1 through F5. But now I need to adjust the backend again and I make change B6. Now I need to rebase my frontend branch to sit on B6? And then I make F6 there (And so on)?
And wouldn't this separation normally be obvious e.g. by paths? If I have a regular non-stack PR with 20 commits and 50 changed files, then 25 files will be in /backend and 25 in /frontend.
Sure, the reviewers who only review /frontend/* might now see half the commits being empty of relevant changes. But is that so bad?
In this model, you tend to want to amend, rather than add more commits. And so:
> they might start with 5 for backend, then 5 commits to each branch to iron out the interface and communication,
You don't add more commits here, you modify the commits in your stack instead.
> Now I need to rebase my frontend branch to sit on B6?
Yes, when you change something lower in the stack, the things on top need to be rebased. Because your forge understands that they're stacked, it can do this for you. And if there's conflicts, let you know that you need to resolve them, of course.
But in general, because you are amending the commits in the stack rather than adding to it, you don't need to move anything around.
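As a rough sketch of the bookkeeping a forge or CLI has to do here (an assumption about the general idea, not a description of what GitHub or `gh stack` actually does): given which branch is based on which, work out what sits above the branch you just changed, in the order it needs rebasing. The `parents` mapping and the branch names are hypothetical.

```python
def branches_to_restack(parents, amended):
    """Return the branches stacked above `amended`, lowest first.

    parents maps each branch to the branch it is based on
    (hypothetical data model). Rebasing in the returned order means
    every branch is rebased only after the one below it.
    """
    order = []
    frontier = [amended]
    while frontier:
        base = frontier.pop(0)
        for branch, parent in parents.items():
            if parent == base:
                order.append(branch)
                frontier.append(branch)
    return order


parents = {"backend": "main", "frontend": "backend", "docs": "frontend"}
print(branches_to_restack(parents, "backend"))   # ['frontend', 'docs']
print(branches_to_restack(parents, "docs"))      # []
```

Amending the top of the stack touches nothing else, which is why changing things low in the stack is the expensive case.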
> And wouldn't this separation normally be obvious e.g. by paths?
In the simplest case, sure. But for more complex work, that might not be the case. Furthermore, you said you have five commits for each; within those sets of five, this separation won't exist.
Is it?
A PR is basically a cyberspatial concept saying "I, as a dog on the internet, am asking you to accept my patches" like a mailing list - this encourages trying to see the truth in the whole. A complete feature. More code in one go because you haven't pre-agreed the work.
Stacks are for the opposite social model. You have already agreed what you'll all be working on but you want to add a reviewer in a harmonious way. This gives you the option to make many small changes, and merge from the bottom.
One thing I keep thinking about in this same direction: even within a single layer of a stack, line-level diffs are still noisy. You rename a function and update x call sites, the diff shows y changed lines. A reviewer has to mentally reconstruct "oh this is just a rename" from raw red/green text.
Semantic diffing (showing which functions, classes, methods were added/modified/deleted/moved) would pair really well with stacks. Each layer of the stack becomes even easier to review when the diff tells you "modified function X, added function Y" instead of just showing changed lines.
I've been researching something in this direction, https://ataraxy-labs.github.io/sem/. It does entity-level diffs, blame, and impact analysis. Would love to see forges like GitHub move in this direction natively. Stacked PRs solve the too much at once problem. Semantic diffs solve the "what actually changed" problem. Together they'd make code review dramatically better.
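For a feel of what "entity-level" means, here is a toy sketch using only Python's standard `ast` module. It is not how sem works (I am not assuming anything about its internals); it only looks at top-level functions and treats "same signature and body under a new name" as a rename. `function_shapes` and `semantic_diff` are made-up names.

```python
import ast


def function_shapes(source):
    """Map each top-level function name to a dump of its signature and body."""
    tree = ast.parse(source)
    return {
        node.name: ast.dump(node.args) + "".join(ast.dump(s) for s in node.body)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def semantic_diff(old_source, new_source):
    """Classify top-level functions as renamed, added, removed, or modified."""
    old, new = function_shapes(old_source), function_shapes(new_source)
    removed = {name for name in old if name not in new}
    added = {name for name in new if name not in old}

    # A removed function whose shape reappears under a new name is a rename.
    renamed = {}
    for old_name in sorted(removed):
        for new_name in sorted(added):
            if new_name not in renamed.values() and new[new_name] == old[old_name]:
                renamed[old_name] = new_name
                break

    return {
        "renamed": sorted(renamed.items()),
        "added": sorted(added - set(renamed.values())),
        "removed": sorted(removed - set(renamed)),
        "modified": sorted(n for n in old if n in new and old[n] != new[n]),
    }


old_source = (
    "def total(xs):\n    return sum(xs)\n\n"
    "def greet(name):\n    return 'hi ' + name\n"
)
new_source = (
    "def compute_total(xs):\n    return sum(xs)\n\n"
    "def greet(name):\n    return 'hello ' + name\n\n"
    "def farewell():\n    return 'bye'\n"
)
print(semantic_diff(old_source, new_source))
```

Here the report says `total` was renamed to `compute_total`, `greet` was modified, and `farewell` was added. Call sites of a renamed function would still show up as "modified" callers in this toy version, which is exactly the noise a real tool would need to fold away.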
There seems to be a native stack navigation widget on the PR page, which is certainly a welcome addition.
The most important question though is whether they finally fixed or are going to fix the issues that prevent submitting stacked PRs from forks. I don't see any indication about that on the linked page.
Only downside is that Phabricator is not open source so viewing it in most things sucks. Hoping now I can get a much better experience
I'm old enough to have worked with SVN and young enough to have taught engineers to avoid stacking PR in Git. All wisdom has been lost and will probably be rediscovered in another time by another generation.
Sure, your application has a dependency on that database, but it doesn't necessarily mean you can't deploy the application before having a database. If possible, make it acceptable for your application to stay in a crashloop until your database is online.
2. I'm not a huge fan of having to use a secondary tool that isn't formally a layer around git / like jj as opposed to github
Honestly I don’t see the benefit of smaller prs, except driving vanity scores?
Like I’m not saying you should
If this works as smoothly as it sounds, that'll significantly reduce the overhead!
I’ve been trying to convince my boss to buy Graphite for this, seems like Github is getting their a* in gear after Cursor bought them.
If Jetbrains ever implements support for them in IntelliJ I will be in Heaven.
Has anyone who was a Graphite user before already tried it?
Stacked PRs are a development method, for managing changes which are separate but dependent on one another (stacked).
The two are orthogonal; they can be used together or independently (or not at all).
I can't remember if Gitlab has the same limitations but I do remember trying to use Gitlab's stacked diffs and finding them to not work very well. Can't remember why tbh.
Huh? Some stacks need to land all at once and need to be reviewed (and merged) from the top down. It’s not uncommon, in my org at least, to review an entire stack and merge 3 into 2 and then 2 into 1 and then 1 into main. If 2 merges before 3, you just rebase 3 onto 1.
There’s a special case where certain official orgs can continue to use github.com instead of github.io for their Pages domain, and that’s how you end up with:
https://github.github.com/gh-stack/
from the code:
Should Pages owned by this user be regarded as “Official GitHub properties”?
  def github_owned_pages?
    GitHub.github_owned_pages.include?(login)
  end

  # Orgs/users that are owned by GitHub and should be allowed to use
  # `github.com` URLs.
  #
  # Returns an Array of String User/Organization logins.
  ...
In practical terms: I manually write a list of PRs, and maintain that list in the description of each of the PRs. Massive duplication. But it clearly shows the merge train.
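If you are stuck doing this by hand, the duplication is at least easy to script. A minimal sketch, assuming you keep the stack as an ordered list yourself; `stack_description`, the PR numbers, and the titles are all hypothetical, and actually pushing the text into each PR is left out.

```python
def stack_description(prs, current):
    """Render the merge train for one PR's description.

    prs is a list of (number, title) tuples in merge order;
    current is the number of the PR this description belongs to.
    """
    lines = ["Merge order:"]
    for position, (number, title) in enumerate(prs, start=1):
        marker = "  <- this PR" if number == current else ""
        lines.append(f"{position}. #{number} {title}{marker}")
    return "\n".join(lines)


# Hypothetical stack, bottom first.
stack = [(101, "Add schema"), (102, "Add endpoint"), (103, "Add form")]
for number, _ in stack:
    print(stack_description(stack, number))
    print()
```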
> This is a docs site that was made to share the spec and CLI for private preview customers that ended up getting picked up. This will move to GitHub docs once it’s in public preview.
A stacked PR allows you to construct a sequence of PRs in a way that allows you to iterate on and merge the isolated commits, but blocks merging items higher in the stack until the foundational changes are merged.
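The blocking rule is simple enough to state as code. This is only a sketch of the rule as described above, with a made-up `can_merge` helper and hypothetical PR names, not GitHub's actual check.

```python
def can_merge(stack, merged, pr):
    """A PR may merge only once everything below it in the stack has merged.

    stack lists PRs from the bottom (closest to trunk) to the top;
    merged is the set of PRs that have already landed.
    """
    below = stack[:stack.index(pr)]
    return all(lower in merged for lower in below)


stack = ["pr-1", "pr-2", "pr-3"]
print(can_merge(stack, set(), "pr-1"))      # True
print(can_merge(stack, set(), "pr-2"))      # False
print(can_merge(stack, {"pr-1"}, "pr-2"))   # True
```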
What they do that the single branch cannot is things like "have a disjoint set of reviewers where some people only review some commits", and that property is exactly why it encourages more well-organized commits, because you are reviewing them individually, rather than as a massive whole.
They also encourage amending existing commits rather than throwing fixup commits onto the end of a branch, which makes the original commit better rather than splitting it into multiple that aren't semantically useful on their own.
(FWIW, I'm dealing with this sort of thing at work right now - working on a complex branch, rewriting history to keep it as a sequence of clean testable and reviewable commits, with a plan to split them out to individual PRs when I finish.)
I've done this manually by building a big feature branch and asking an LLM to extract out functionality for a portion of it.
For the former, it would seem to split based on frontend/backend, etc. rather than what semantically makes the most sense and for the latter it would include changes I don't want and forget some I do want. But I haven't tried this a lot.
Also if someone could help me understand: Are these so-called stacked commits not possible with multiple commits on a single branch? I prefer to write my commits as atomic, independent, related changes, on a single branch, with both Git and Mercurial. I am apparently missing something: why can't a better UI simply show a multi-change PR?
In the tool I wrote, you have a single branch with linear history. PRs in the chain are demarcated via commit messages. You then don't need any special rebase / sync commands -- you can use regular `git rebase -i` to reorder commits or edit a commit in the middle of a stack. Literally the only special command I need is "push this branch to github as multiple PRs".
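For readers wondering what "demarcated via commit messages" could look like, here is a small sketch. The commenter does not say what marker their tool uses, so the `PR: <name>` line below is purely an assumption, as are `split_into_prs` and the sample commits.

```python
from itertools import groupby


def pr_name(commit):
    """Extract the hypothetical 'PR: <name>' marker from a commit message."""
    _sha, message = commit
    for line in message.splitlines():
        if line.startswith("PR: "):
            return line[len("PR: "):].strip()
    return None


def split_into_prs(commits):
    """Group consecutive commits that carry the same marker into one PR.

    commits is a list of (sha, message) tuples, oldest first, as you
    would get from a linear branch. Returns [(pr_name, [shas]), ...]
    in stack order, bottom first.
    """
    return [
        (name, [sha for sha, _message in group])
        for name, group in groupby(commits, key=pr_name)
    ]


commits = [
    ("a1", "Add schema\n\nPR: backend"),
    ("b2", "Add endpoint\n\nPR: backend"),
    ("c3", "Add form\n\nPR: frontend"),
]
print(split_into_prs(commits))
# [('backend', ['a1', 'b2']), ('frontend', ['c3'])]
```

Because the grouping only depends on commit order and messages, reordering with `git rebase -i` changes the stack without any extra bookkeeping, which matches the appeal described above.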
Anyway I hope that alongside the branch-based tool you've built in `gh` there will be an API that I can target.
I think they have a culture of circumventing 'official' channels and whoever is in charge of a thing is whoever publishes the thing.
I think it's a great way to train users to get phished by github impostors, if tomorrow we see an official download from official.github.com or even official-downloads.github.io, sure it's phishy, but it's also something that github does.
It's also 100% the kind of issues that, if it happens, the user will be blamed.
I would recommend github to stop doing this stuff and have a centralized domain to publish official communications and downloads from. Github.github.com? Come on, get serious.
TL;DR: DO NOT DOWNLOAD ANYTHING from this site (especially not npm/npx/pnpm/bun/npjndsa stuff). It's a Github Pages site, just on a subdomain that looks official; theoretically it might be no harder for an attacker to obtain access to github.github.com than to dksabdkshab.github.com. Even if it is official, would you trust the intern or whoever managed to get a subdomain to not get supply chained? github.github.com, just think about it.
The quoted Microsoft examples are way worse. I see this with outbound email systems a lot, which is especially dangerous because email is a major surface of attack.