I loathe GitHub PRs because of this. At $dayjob, the unit of change is the commit, and every commit is reviewed and signed off by at least one peer.
And you know what? I love it. Yes, there's some overhead. But I can understand each commit in its entirety. I and my coworkers have caught numerous issues in code with these single-purpose commits of digestible size.
Compare this to GitHub PRs, which tend to be beastly, poorly structured, multipurpose things (GitHub's UI only adds to the review problems...). Reviewing these big PRs with care is just so much harder. People don't care about commit messages, so the git log becomes a mess that's hard to navigate.
I will split up a PR into:
- Individual steps of a refactor, especially making any moves their own commits
- Tests added *before* the feature (passing, showing the old behavior)
- The actual fix or feature commit, kept tiny, with the diff of the tests demonstrating how behavior changed
This makes it really fast to review a PR because you can see the motivation while only looking at a small change. No jumping around trying to figure out how the pieces fit together.
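As a sketch, the resulting history for a small fix might look something like this (hashes and messages are hypothetical, newest first):

```sh
git log --oneline
# 1111111 Fix off-by-one in pagination (tiny diff; the test expectations flip here)
# 2222222 Add tests pinning the current buggy pagination behavior (all passing)
# 3333333 Refactor: extract page-size calculation (no behavior change)
# 4444444 Move pagination helpers into their own module (pure move)
```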
The main reason I might split up a PR like that is if one piece is likely to generate discussion before it's merged. I then make that piece a separate PR, as small as possible, so we can have a focused discussion.
I hate squash merges.
I'll also add one more to your list: any improvements that came out of the review and landed in that merge should each be individual commits. I've seen hard-to-trigger bugs introduced by what should have been just a style improvement.
So as a rule, I tend to stick with "1 PR == 1 commit", except when there's a compelling reason not to.
This is why, contra the linked article about commit messages, I strive to make minimal and cohesive commits with good messages. Commits are for future archaeology, not just a way to save my work every minute in case my hard drive dies.
In ~18 years of git use, I have never needed it, but I see it mentioned often as an important reason to handle commits in a certain way. I wonder what accounts for the difference.
That's probably the most important case: large complex codebases with lots of changes where "wtf is going on" isn't so obvious from just the code.
I've never used it for any of my personal projects or even at my dayjob, because they're much smaller with far fewer changes (relatively speaking).
Having said that, bisect is also an O(log N) method and it's useful where otherwise you might end up spending O(N) time debugging something. I have myself split a configuration change into many stupidly-small commits (locally, without the intention to push that) purely so I could run bisect instead of manually reviewing the change to figure out which part broke stuff.
One case where git bisect really saved me was when we discovered a really subtle bug in an embedded project I was working on. I was able to write a 20-line shell script that built the code, flashed the device, ran 100 iterations of a check, and did some basic stats. I left the bisect chugging away over the weekend and came back to an innocuous-looking commit from 6 months earlier. We could've spent weeks trying to find the root cause without it.
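The automation is simpler than it sounds; roughly (the script name and known-good ref here are hypothetical):

```sh
git bisect start
git bisect bad HEAD          # the current state fails
git bisect good v1.4.0       # a known-good release
git bisect run ./check.sh    # exit 0 = good, nonzero = bad, 125 = skip this commit
git bisect reset             # return to where you started
```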
Heck, I used it yesterday: I had a PR cleaning up the C++ build system, and things stopped working in CI in weird ways I couldn't figure out, even though everything was fine locally. I used bisect locally to figure out which commits to test. You just have to accept that a blind bisect is going to be more effective than spot-checking the commits you suspect (and for tricky bugs it often is, because your intuition can mislead you).
I’ve also used it to find weird performance regressions I couldn’t figure out.
Bisects are good when the bug is reproducible, you have a "this used to work and now it doesn't" situation, and the code base is too big or too unfamiliar for you to have a good intuition about where to look. You can pretty quickly get to the commit where the bug first appears, and then look at what changed in that commit.
There was one git-using startup I worked for that had a merge-recklessly development style. On one occasion I could have used `git bisect` to disprove my coworker's accusation that I had thoughtlessly broken the build (it was him, actually, and, yuck, what a toxic work environment!), but the commit in question was like HEAD~3, so it would probably have taken me longer to use git bisect than to just test it by hand.
You _never_ have bugs that slip through the test suite? That is extremely impressive / borderline impossible. Even highly tested gold-standard projects like SQLite see bugs in production.
And once you find a regression that wasn't covered by your test suite, the fastest way to find where it originated is often git bisect.
You're skeptical of a far stronger claim than the one they actually made.
It's the same for me, but some integration bugs still escape the notice of unit tests. Examples from memory: a specific subset of users being thrown into an endless redirect due to a cookie rename that wasn't propagated across all sub-systems on the backend; multiple instances of run-time errors resulting from dependency version mismatches (dynamic loading); and a new notification banner covering critical UI elements on an infrequently used page, effectively conflicting CSS positioning. In all these cases the CI tests were passing, but passing tests don't mean your software is working as expected in all cases.
In the cases you mentioned, robust e2e and integration tests would ideally be able to catch the bugs. And for the UI issue in particular, I wouldn't think to track down the commit that caused it, but just fix it and move on.
And the few times I've reached for it, I was really thankful it was there.
I also tried to make a minimal reproduction but wasn’t able to.
You're bisecting the history of PR merges.
I responded to the first case. Of course I ignored the claim you made about the second case; if I hadn't ignored it, I would have been making a strawman out of what you said, mixing up your words in a way that doesn't make sense.
- The real unit of change lives in Git
- The real unit of change lives on some forge
I want it to live in Git.
Respectfully, that's the dumbest thing I've ever heard.
Work on your feature until it's done, on a branch. And then when you're ready, have the branch reviewed. Then squash the branch when merging it, so it becomes 1 commit.
Commit and push often, on branches, and squash when merging. Don't review commits, review the PR.
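Mechanically, that's just (branch name hypothetical):

```sh
git checkout main
git merge --squash feature/foo   # stage the branch's combined diff, no commit yet
git commit                       # one commit on main; write the real message now
```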
I've had people at various jobs accidentally delete a directory, effectively losing all their progress, sometimes weeks' worth of work. I've experienced laptops being stolen.
If I used your system, over the years my colleagues and I would have lost work irretrievably a few times by now, potentially sinking a startup by missing a deadline.
I feel your approach shows a very "Nothing bad will ever happen" attitude.
Yes, of course you should have a backup. Most of those don't run every few minutes, though. Or even every few hours.
"Just trust the backup" feels like a really overkill solution for a system that has, as a core feature, pushing to a remote server. And frankly, a way to justify not using the feature.
They made it pretty clear they're talking about not-large commits. And they're contrasting that with any-size PRs.
But you also say it's pure dogma?
I'm confused.
Because this is exactly what a squash-merged PR is. There is no meaningful difference, unless you say "but commits are done by good people and PRs are done by bad people".
The preference they have is not exactly a problem with GitHub PRs, but GitHub PRs are much more likely to mean reviewing a big pile of code at once.
The amount of code being reviewed at once is a meaningful and extremely objective measure, and that's the thing they're concerned with. Not who made it.
I probably won't be able to respond to any more comments. Dang put a slowban on my account because he interpreted one of my comments as right wing.
When you're looking back at why something was done in a certain way, the review view is more useful than either the squashed view or the stream-of-work view. A human put effort into making it understandable, so it's no surprise that it's more understandable.
I'm talking about a forge that allows some sort of persistent reviewable unit that can change over time: Phabricator revisions, jj changes, and at least Gerrit has the same thing. There isn't a single unit of review; there are two: the bug and the individual changes. The bug is associated with a stack of changes. An individual change initially corresponds to a commit, but when you rename a variable, you update the commit for that change. jj (and I guess git) calls that squashing; hg calls it amending.
The author does work, useful work, to break down everything that needs to change for that bug into a series of reviewable changes, preferably not breaking the build or tests in the middle of the series, but that's negotiable.
So we may not be disagreeing on anything. If by "commit" you mean one item in a patch stack, then yes we squash. But we do not squash the different changes within a bug, whether or not they change during review. If there's a change that does some refactoring to prepare, then a change to add a new API, then a change to add users and tests of those users, then we review those separately and do not squash for landing.
It is definitely my preferred way of working. I don't want to see a dozen fixup commits, nor do I want to see a giant change that touches everything at once. It's a happy middle ground where the author decides what the reviewer and later debuggers need to look at and what they can skip.
But even if they're lying about achieving it, the preference they have for reviewing small commits is a preference that makes sense. It's not some nonsense "us versus them" thing.
`lazyjj` [1] makes it easier to navigate around the change log (aka commit history) with single keypresses. The only workflow it's currently missing for me is `split`.
For the times when I have had to push to a shared git repo, I used the same technique mentioned in the article to prevent making changes to other developers' commits [2].
It's been a seamless transition for me, and I intend to use Jujutsu for years.
[1] https://github.com/Cretezy/lazyjj [2] https://jj-vcs.github.io/jj/latest/config/#set-of-immutable-...
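For the curious, the technique in [2] boils down to a single config entry. A sketch, with the caveat that the exact alias value is an assumption from memory; check the linked docs:

```sh
# Treat everything already on a remote bookmark as immutable,
# so jj refuses to rewrite other people's pushed commits.
jj config set --user 'revset-aliases."immutable_heads()"' \
  'builtin_immutable_heads() | remote_bookmarks()'
```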
When my then-employer first started using git, I managed to convince them to set up Gerrit -- I had my own personal Gerrit instance running at home, it was great. I think it lasted about a year before the popularity factor kicked in and we switched to GitLab.
At least part of the difficulty is that approximately everyone who would need to learn a new mechanism is already familiar with PRs. And folk new to the industry learn pretty quickly that it's normal and "easy". The cultural inertia is immense, and that means we're much more likely to evolve our way out of the hole (or switch to a completely different paradigm) than we are to switch en masse to a different way of working that's still Git.
There are ways to make the PR experience more reasonable; GitHub has introduced merge queues, which let PRs stack up and land in order. Even without that, I find disabling merge commits puts my team in a sweet spot where even if I'd prefer Gerrit, I'm not tearing my hair out, and neither are the rest of my team, who (mostly) don't care all that much.
I think the MR workflow is better for smaller projects, i.e. ~10 devs: it's lower overhead; just commit while you work, then write up a description when you push.
I think rebase-CP is better for larger projects, like ~100 devs committing to a repo: a linear git history, with every commit having a clear description, purpose, and review, is worth the overhead at that point.
So one-off tools, infra, and stuff get chucked into GitLab while "the product" is in Gerrit.
Why doesn't GitHub just flatten/squash the stack of commits visually? I don't care what they're doing inside .git; can't they just diff the latest commit against the base commit and display that? So it's visually merged even if it's not merged in git's log?
---
At my work we do single commits. I find it annoying to work that way, and I sometimes try to make clones of certain commits so I can restore to that point if I need to. But for reviewing, having everything in one neat little bundle is nice.
I missed that. How does it work?
But it seems like it gives you a place in a queue, meaning you don't need to keep rebasing as earlier PRs get merged: you can see what the state of the main branch should be by the time your PR reaches the head of the queue and develop accordingly.
https://docs.github.com/en/repositories/configuring-branches...
For something as fundamental as source control in this day and age, I’d go with the former open source option (and have recently been learning Jujutsu)…
In JJ, however, changes are explicitly mutable, which means if you finish a commit and then realise you want to go back and change something, you've got multiple ways to squash your fix directly into the original commit. In addition, you have a local history of every change you make to your project (including rewriting existing commits), so you never lose any data locally. However, when you finish adjusting your commits and push them to someone else's machine (e.g. GitHub), they will only get the finished commits, not the full history of how you developed them.
I know some Fossil people feel very strongly that any sort of commit rewriting should be banned and avoided at all costs, because it creates a false history. You can use Jujutsu like that, but it probably doesn't add much over Fossil at that point. In practice, though, I think Jujutsu strikes the right balance between letting you craft the perfect patch/PR for a clean history and preventing you from losing any data while you do it.
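A minimal sketch of what that looks like day to day (the change ID `xyz` is hypothetical):

```sh
jj new               # start a fresh working-copy change on top
# ...edit files to fix the earlier commit...
jj squash --to xyz   # fold the fix into the earlier change xyz
jj op log            # every operation, including rewrites, is kept locally
jj undo              # and any operation can be backed out
```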
How do you deal with accidentally committing text that was not supposed to be there (say, accidentally pasting a private email)?
JJ itself uses GitHub, whereas Fossil is very easy to self-host, including issue tracking, wiki, etc.
In some ways, this means jj has less features than git. For example, jj doesn't have an index. But that's not because you can't do what the index does for you, the index is just a commit, like any other. This means you can bring to bear your full set of tools for working on commits to the index, rather than it being its own special feature. It also means that commands don't need certain "and do this to the index" flags, making them simpler overall, while retaining the same power.
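Concretely, the git staging dance maps onto plain commits; a rough sketch:

```sh
jj new               # the parent change now plays the role of the "index"
jj squash -i         # interactively move hunks into it, like `git add -p`
jj describe -r @-    # edit the "staged" change's message whenever you like
```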
I wish GitHub would just allow you to say "this PR depends on this other PR", and then it wouldn't show commits from the other PR when reviewing the stacked PR.
But as far as I know you can't do that.
Is that what you want?
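That is, pointing the stacked PR's base at the dependency's branch instead of main, e.g. with the gh CLI (branch names hypothetical):

```sh
# Review of part-2 then shows only its own commits, not part-1's.
gh pr create --base feature/part-1 --head feature/part-2
```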
It's a hacky workaround at best. I want proper support. I want GitHub to disable the merge button until the dependencies are merged. I want it to actually list the dependencies and link to them. I want the target branch to be `main`.
I think if they did those three things it would go 90% of the way. Too much to ask, though, clearly.
I really wish 1 PR = 1 commit, with revision-based diffs, similar to Gerrit.
I'm not sure you'd call it "automatic" though. Were you thinking of using an LLM to split the commits with some semantic awareness? It doesn't do that!
Does this mean that you have to proactively remember and undo any config file changes you made e.g. while fixing an issue in a test environment? Sounds a little risky.
For example, I recently had a set up that looked something like this:
```
@ w - the checked out change where I'm editing files
|
* x - local config change
|
* y (bookmark 1) - CI config change that sets up a temp test env
|
* z (bookmark 2) - Start of the changes that will eventually get reviewed and merged
```
With this setup, I could make changes in an environment that included all of the config changes I needed. Then when I was happy with something and wanted to "save my work", I could run `jj squash --to z`, which would squash everything in change w into change z, rebasing everything else automatically on top of that. Then I could run `jj git push`, which force-pushed the changes at y and z to their separate branches on GitHub. I had a pull request for the branch at z which other people could then review and where all the tests could run. Meanwhile, the branch at y had updated the CI config to remove the usual tests and deploy everything to a temporary environment. So each push automatically updated my test env, and updated the pull request with the "clean" changes (i.e. without my config file fiddling).

If I wanted this sort of setup longer term, I'd find some other way to achieve it without relying on Jujutsu, but for ad-hoc cases like this it's absolutely great.
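Condensed, the save-and-push loop above is just two commands (using the change letters from the graph):

```sh
jj squash --to z   # fold the edits in w into z; x and y rebase automatically
jj git push        # force-push bookmark 1 (at y) and bookmark 2 (at z)
```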
what you should have is support for local, gitignore-able config files
How did t end up after u?
I'd expect that to fork into (s -> t) and (s -> u -> v). Either that or maybe (s -> t -> v) and (s -> u).
As for "why", I think it behaves different from what you expected in two ways. The first is that `--revision` rebased only the specified revisions without theirs descendants (there's also `--source` if you want to include the descendants). The other things it that `--after` is a short form of `--insert-after`, so it's designed for inserting commits between other commits. There's also `--before/--insert-before`. There's also `--destination` which is what you expected (perhaps `--onto` would have been a better name for it). You can pass multiple commits to all these arguments, btw (yes, that includes `--destination`, so you can create new merge commits with `jj rebase`). https://jj-vcs.github.io/jj/latest/cli-reference/#jj-rebase has many examples.
That sounds like 2 operations, not 1. I think that's why I was confused.
The docs do clarify that there's an extra rebase going on though, so thanks!
Yes, -d sounds like what I expected.
The basic workflow is fine. And there are some very powerful features. But then you try to find the parent of a given branch and you're left wondering how the f#!@ that's so hard.
It's definitely nit-picking. It's probably 85-90% of what you want it to be. But there is definitely enough room for improvement to warrant moving beyond it. I think the workflows and integrations (GitHub, GitLab, etc.) make it stickier as well. I just don't think we should assume everyone is going to use git for the next 20+ years.
Both relate to the same fundamental problem/feature: a git commit is a hash of its contents, including previous content (via the parent commit, whose hash is part of the input for the child commit's hash).
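You can see this directly in the raw commit object (output abridged; hashes hypothetical):

```sh
git cat-file -p HEAD
# tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
# parent 9fceb02d0ae598e95dc970b74767f19372d61af8   <- feeds into this commit's hash
# author Jane Doe <jane@example.com> 1700000000 +0000
# ...
```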
It would be nice if Codespaces kept the JJ change stored somewhere, so it isn't tied to the lifetime of the machine.
I often see programming as going MMA with data, until you get what you want out of it.
Using brute force algorithms is going berserk, vibe coding is spamming summons in hope they get the job done, etc.
I've settled on using Interactive Git Log in VSCode.
I feel like people should just learn to use git.
Among the hard work on the jj maintainers' plate is making it just as attractive and intuitive to people without that background.
I've been using git for 15+ years, and I definitely share the sentiment that basically everyone needs to learn it well, as it is just that entrenched.
…but I love that people are coming up with potential alternatives that have a meaningful chance of moving the whole VCS space forward.
Why not just merge it into git?
And I'm sure there are some people who actually prefer Sapling or raw git.
and deal with more abstraction layers if anything goes wrong
> It’s clear from multiple posts over years that jj has a better UX than plain git.
it is clear that multiple people like it, not that the UX is strictly superior