Rebase Supremacy

JPDev@programming.dev to Programmer Humor@programming.dev – 1088 points –
225

I think this is a fake quote that somebody made up for an Internet comedy bit, since it seems unlikely for Hollywood actress Sydney Sweeney to have such uncharacteristically strong opinion on software version control, of all things.

Because she of all people would know that there isn't anything wrong with using git merge, and it ultimately comes down to personal preference to what you are used to.

Fair point, Margot Robbie

That's esteemed Academy Award nominated character actress Margot Robbie to you!

And successful Hollywood film producer -- props on getting into the stakeholder end of the business so early in your career!

Margot Robbie, I was about to agree with you and thought that was a very reasonable take, until you tried to argue that git merge is better than git rebase, then I simply had to disregard the whole thing.

This is why Sydney Sweeney isn't on Lemmy.

She probably is, just anonymous. It would be crazy to expect anyone to post on lemmy under their real name.

But they were arguing that it's personal preference not that one is better than the other

"Don't always trust what you read on the internet."

  • Benjamin Franklin

Wait a second, there wasn't even any social media sites back when Benjamin Franklin lived. Did he write that in his newsletter or something?

I think he was a senior contributor for the underground cracker mag 1600 back in the late 80s.

They called em zines.

But esteemed Academy Award nominated character actress and film director, Margot Robbie, if it's unlikely that Hollywood actress Sydney Sweeney said this... wouldn't it be just as unlikely that Margot Robbie would be here? Adding her own comment?

... are you projecting? Is there something you want to tell us esteemed Academy Award nominated character actress and film director Margot Robbie?

I think this is a fake quote that somebody made up for an Internet comedy bit

You can tell by the pixels

I mean, it’s posted in programming humor so yeah.

A bit of the old on the internet no-one knows you're a dog, I think. Therefore I am a webdev dog too.

Why is anyone using X in 2024?

I do, I have yet to switch to Wayland

Shit, I unironically thought they were talking about this, and I unironically haven’t switched because Barrier is broken on Wayland

I use X because Cinnamon on Wayland has no option to change the keyboard layout

I tried their experimental Wayland session and it's still super buggy on high refresh rate/high DPI screens (loads of graphical errors and artifacts) so still a ways to go imo

Wayland really doesn't like RDP/remote access, so X is the only way to go if you want that to work properly.

I actually never had an issue on my wayland system. I used remmina for rdp but never had an issue.

I tried it for a moment, made games stutter like hell, switched back. I know I need to go in and figure it out at some point, but it's hard to muster the energy when X, for the most part, works fine.

From what I've seen, it probably has to do with my Nvidia GPU.

Odd, when i switched to Wayland the stuttering stopped! Also on Nvidia

On nvidia, it definitely feels much smoother, but some GPU accelerated programs like games become flickery, i think it's an xwayland issue

Depends on a few things with your setup: age of your GPU, the resolution/refresh rate of your monitor. I think even the choice of DP/HDMI can have an impact too

I'm currently at the point where I blame everything that works on my Laptop but not on my PC on Nvidia, because that's literally the biggest difference between those two. Like currently my getty isn't displaying properly, which is surely NVidias fault.

Because my new intel integrated graphics cause Wayland to run like a slideshow.

because some of the hardware I use is too old to run Wayland

Idk I hang out on Mastodon.social

If yall don't like the chronological feed then find a hashtag you like and start following people.

4 more...

I prefer to rebase as well. But when you're working with a team of amateurs who don't know how to use a VCS properly and never update their branc with the parent branch, you end up with lots of conflicts.

I find that for managing conflicts, rebase is very difficult as you have to resolve conflicts for every commit. You can either use rerere to repeat the conflict resolution automatically, or you can squash everything. But when you're dealing with a team of Git-illiterate developers (which is VERY often the case) you can either spend the time to educate them and still risk having problems because they don't give a shit, or you can just do a regular merge and go on with your life.

Those are my two cents, speaking from experience.

I agree that merge is the easier strategy with amateurs. By amateurs I mean those who cannot be bothered to learn about rebase. But what you really lose there is a nice commit history. It's good to have, even if your primary strategy is merging. And people tend to create horrendous commit histories when they don't know how to edit them.

Honestly, I'm pretty sure 99.9% of git users never really bother with the git history in any way that would be hindered by merging.

Git has a ton of powerful features, but for most projects they don't matter at all. You want a distributed consensus, that's it. Bothering yourself with all those advanced features and trying to learn some esoteric commands is frankly just overhead. Yes, you can solve great problems with them, but these problems almost never occur, and if they do, using the stupid tools is faster overall.

@agressivelyPassive @technom That's a self-fulfilling prophecy, IMO. Well-structured commit histories with clear descriptions can be a godsend for spelunking through old code and trying to work out why a change was made. That is the actual point, after all - the Linux kernel project, which is what git was originally built to manage, is fastidious about this. Most projects don't need that level of hygiene, but they can still benefit from taking lessons from it.

To that end, sure, git can be arcane at the best of times and a lot of the tools aren't strictly necessary, but they're very useful for managing that history.

Yup, once you can use git with good hygiene, it opens up the door to add in other tools like commitizen and semantic-release, which completely automates things like version number bumps and changelog generation.

I fucking hate auto-generated changelogs, so I consider that a downside.

I'd still argue, that the overhead is not worth it most of the time.

Linux is one of the largest single pieces of software in existence, of course it has different needs than the standard business crap the vast majority of us develop.

To keep your analogy: not every room is an operating room, you might have some theoretical advantages from keeping your kitchen as clean as an OR, but it's probably not worth the hassle.

To keep your analogy, most people's git histories, when using a merge-based workflow, is the equivalent of never cleaning the kitchen, ever.

No, it's not. And you know that.

Seriously, ask yourself, how often did the need arise to look into old commits and if it did, wasn't the underlying issue caused by the processes around it? I've been in the industry for a few years now and I can literally count on one hand how often I had to actually look at commit history for more than maybe 10 commits back. And I spend maybe 10min per year on that on average, if at all.

I honestly don't see a use case that would justify the overhead. It's always just "but what if X, then you'd save hours!" But X never happens or X is caused by a fucked up process somewhere else and git is just the hammer to nail down every problem.

Seriously, ask yourself, how often did the need arise to look into old commits

Literally every single day. I have a git alias that prints out the commit graph for my repositories, and by looking at that I can instantly see what tasks my coworkers are working on, what their progress is, and what their work is based on. It's way more useful than any stand-up meeting I've ever attended.

I've been in the industry for a few years now and I can literally count on one hand how often I had to actually look at commit history for more than maybe 10 commits back.

I've been in the industry for nearly 15 years, but I can say that the last 3 years have been my most productive, and I attribute a lot of that to the fact that I'm on a team that cares about git history, knows how to use it, and keeps it readable. Like other people have been saying, this is a self fulfilling prophecy - most people don't care to keep their git history readable, so they've only ever seen unreadable git histories, and so they think git history is useless.

I honestly don't see a use case that would justify the overhead.

What overhead? The learning curve on rebasing isn't that much steeper than that of merging or just using git itself. Take an hour to read the git docs, watch a tutorial or two, and you're good to go. Understand that people actually read your commit messages and take 15 extra seconds to make them actually useful. Take an extra minute before opening an MR to rebase your personal branches interactively and squash down the "fixed a typo" and "ran isort" commits into something that's actually useful. In the long run this saves time by making your code more easily reviewable, and giving reviewers useful context around your changes.

It's always just "but what if X, then you'd save hours!" But X never happens or X is caused by a fucked up process somewhere else and git is just the hammer to nail down every problem.

No, having a clean, readable git history literally saves my team hours. I haven't had to manually write or edit a changelog in three years because we generate it automatically from our commit messages. I haven't had to think about a version number in three years because they're automatically calculated from our commit messages. Those are the types of things teams sink weeks into, time absolutely wasted spent arguing over whether this thing or that is a patch bump or a minor bump, and no one can say for sure without looking at diffs or spinning up multiple versions of the code and poking it manually, because the git log is a tangled mess of spaghetti with meatballs made of messages like "finally fixed the thing" and "please just work dammit". My team can tell you those things instantly just by looking at the git log. Because we care about history, and we keep it clean and useable.

I gotta say, I was with you for most of this thread, but looking through old commits is definitely something that I do on a regular basis... Like not even just because of problems, but because that's part of how I figure out what's going on.

The whole reason I keep my git history clean and my commit messages thoughtful is so that future-me (or future-someone-else) will have an easier time walking through it later, because that happens all the time.

I'll still almost always choose merge instead of rebase, but not because I don't care about the git history-- quite the opposite, it's really important to me in a very practical way.

We use history and blame a lot

Only users who don't know rebasing and the advantages of a crafted history make statements like this. There are several projects that depend on clean commit history. You need it for conventional commit tools (like commitzen), pre-commit hook tools, git blame, git bisect, etc.

Uuuh, am I no true Scotsman?

Counter argument: why do you keep fucking up so bad you need these tools? Only users who are bad at programming need these. Makes about as much sense as your accusation.

You keep iterating the same arguments as the rest here, and I still adhere to my statement above: hardly anybody needs those tools. I literally never used pre-commit hooks or bisect in any semi-professional context. And I don't know a single project that uses them. And before you counter with another "well u stoopid then" comment: the projects I've been working on were with pretty reputable companies and handled literally billions of Euros every year. I can honestly say, that pretty much everyone living in Germany had his/her data pushed through code that I wrote.

Uuuh, am I no true Scotsman?

That's a terrible and disingenuous take. I'm saying that you won't understand why it's useful till you've used it. Spinning that as no true Scotsman fallacy is just indicative of that ignorance.

You keep iterating the same arguments as the rest here, and I still adhere to my statement above: hardly anybody needs those tools.

And you keep repeating that falsehood. Isn't that the real no true Scotsman fallacy? How do you even pretend to know that nobody needs it? You can't talk for everyone else. Those who use it find it useful in several other ways that I and others have explained. You can't just judge it away from your position of ignorance.

Why would you want to edit your commit history? When I need to look at it for some reason, I want to see what actually happened, not a fictional story.

Because when debugging, you typically don't care about the details of wip, some more stuff, Merge remote-tracking branch 'origin/master', almost working, Merge remote-tracking branch 'origin/master', fix some tests etc. and would rather follow logical steps being taken in order with descriptive messages such as component: refactor xyz in preparation for feature, component: add do_foo(), component: implement feature using do_foo() etc.

You can have both. I'll get to that later. But first, let me explain why edited history is useful.

Unedited histories are very chaotic and often contains errors, commits with partial features, abandoned code, reverted code, out-of-sequence code, etc. These are useful in preserving the actual progress of your own thought. But such histories are a nightmare to review. Commits should be complete (a single commit contains a full feature) and in proper order. If you're a reviewer, you also wouldn't want to waste time reviewing someone else's mistakes, experiments, reverted code, etc. Self-complete commits also have another advantage - users can choose to omit an entire feature by omitting a commit.

Now the part about having both - the unedited and carefully crafted history. Rebasing doesn't erase the original branch. You can preserve it by creating a new branch. Or, you can recover it from reflog. I use it to preserve the original development history. Then I submit the edited/crafted history/branch upstream.

How others are keeping their branches up to date is their problem. If you use Gitlab you can set up squash policy for merge requests. All the abomination they’ve caused in their branch will turn into one nice commit to the main branch.

In a small team at a small company it becomes my problem pretty quickly, since I'm the only one that actually has some clue about what git does.

This. When they get any sort of conflicts in their pull request, it becomes MY problem because they don't know what to do.

Heaven forbid my teammates read any documentation or make any attempt to understand the tooling necessary to do their job.

That being said, I taught my dumbass git-illiterate team members a rebase workflow, with the help of the git UI in Pycharm. Haven't had any issues with merge conflicts yet, but that might just be because they're too scared to ask me for help any more

I don't want squashed commits. It makes git tools worse (git bisect, git cherry-pick, etc.) and I work very hard to craft a meaningful set of commits for my work and I don't want to throw all of that away.

But yeah, I don't actually give a shit what they are doing on their branches. I regularly rebase onto master anyway.

Please for the love of god don't use merge, especially in a crowded repository. Don't be me and suffer the consequences. I mistakenly mention every person with a commit between the time I created the branch until current master.

That was you! I remember this.

AAAH NOT LIKE THIS

There's 102 people mentioned in that commit and two of them happen to meet in the comments of a meme thread on Lemmy of all places. I love the Internet.

Could have been worse. I mean, like, imagine of you were using like CVS and you put a watch on the root! Haha and then like every trivial commit in the repo caused everyone to in the entire org to get an email and it crashed the email servers.

Like who'd even DO that?! Though, I bet if you met that guy he'd be ok. Like not a jerk, and pretty sorry for all those emails. A cool guy.

Merge is not the issue here, rebase would do the same.

really? how come? I thought they are mentioned because of the diffs if compared to master, which merge basically just... merge on top of my branch (?)

They were mentioned because a file they are the code owner of was modified in the PR.

The modifications came from another branch which you accidentally(?) merged into yours. The problem is that those commits weren't in master yet, so GH considers them to be part of the changeset of your branch. If they were in master already, GH would only consider the merge commit itself part of the change set and it does not contain any changes itself (unless you resolved a conflict).

If you had rebased atop of the other branch, you would have still had the commits of the other branch in your changeset; it'd be as if you tried to merge the other branch into master + your changes.

Just for the record, I think you're conflating git and GitHub. They are not the same thing, even if GH would like you to think so.

I am not. Read the context mate.

You sent over twenty-two thousand notifications lmao.

And then the bot added about as many tags to the PR.

ITT: people who have no idea how rebasing works.

Skill issue tbh

No doubt. git rebase is like a very sharp knife. In the right hands, it can accomplish great things, but in the wrong hands, it can also spell disaster.

As someone who HAS used it a fair amount, I generally don't even recommend it to people unless they're already VERY comfortable with the rest of git and ideally have some sense of how it works internally.

Yeah it is something people should take time to learn. I do think its "dangers" are pretty overstated, though, especially if you always do git rebase --interactive, since if anything goes wrong, you can easily get out with git rebase --abort.

In general there's a pretty weird fear that you can fuck up git to the point at which you can't recover. Basically the only time that's really actually true is if you somehow lose uncommitted work in your working tree. But if you've actually committed everything (and you should always commit everything before trying any destructive operations), you can pretty much always get back to where you were. Commits are never actually lost.

True, the real danger is using git reset with the --hard flag when you haven't committed your changes lol

You can get in some pretty serious messes, though. Any workflow that involves force-pushing or rebasing has the potential for data loss... Either in a literally destructive way, or in a "Seriously my keys must be somewhere but I have no idea where" kind of way.

When most people talk about rebase (for example) being reversible, what they're usually saying is "you can always reverse the operation in the reflog." Well yes, but the reflog is local, so if Alice messes something up with her rebase-force-push and realizes she destroyed some of Bob's changes, Alice can't recover Bob's changes from her machine-- She needs to collaborate with Bob to recover them.

Pretty much everything that can act as a git remote (GitHub, gitlab, etc.) records the activity on a branch and makes it easy to see what the commit sha was before a force push.

But it's a pretty moot point since no one that argues in favor of rebasing is suggesting you use it on shared branches. That's not what it's for. It's for your own feature branches as you work, in which case there is indeed very little risk of any kind of loss.

Ah, you've never worked somewhere where people regularly rebase and force-push to master. Lucky :)

I have no issue with rebasing on a local branch that no other repository knows about yet. I think that's great. As soon as the code leaves local though, things proceed at least to "exercise caution." If the branch is actively shared (like master, or a release branch if that's a thing, or a branch where people are collaborating), IMO rebasing is more of a footgun than it's worth.

You can mitigate that with good processes and well-informed engineers, but that's kinda true of all sorts of dubious ideas.

Pushing to master in general is disabled by policy on the forge itself at every place I've worked. That's pretty standard practice. There's no good reason to leave the ability to push to master on.

There's no reason to avoid force pushing a rebased version of your local feature branch to the remote version of your feature branch, since no one else should be touching that branch. I literally do this at least once a day, sometimes more. It's a good practice that empowers you to craft a high-quality set of commits before merging into master. Doing this avoids the countless garbage fix typo commits (and spurious merge commits) that you'd have otherwise, making both reviews easier and giving you a higher-quality, more useful history after merge.

Tbh, we just don't hire people if they can't operate git.

Okay this is the second time I've seen Sydney Sweeney referenced in a meme in less than half a day. I had never heard of her before. Who is she, and why is she suddenly attracting so much meme attention?

She's an Australian American actress who blew up last year (she was in euphoria I think?), expect to see her in a ton of upcoming blockbusters.

She is from Spokane Washington not Australia. She got a lot of recognition for her role in Euphoria and is blowing up a bit right now because she is a young, attractive, talented actress.

She was excellent in the first season of White Lotus on Sky too. Great show.

Her blown up remains you mean?

Yeah, I had a feeling that was not the best way to put it but I was in a hurry :)

She's an American actress who was in Handmaid's Tale and a few other things and then blew up thanks to her role in Euphoria. She's become a bit of a meme recently because online conservatives think that her boobs are anti-woke

I've been using merge, and I hate that I don't even know what rebase really does

Merge takes two commits and smooshes them together at their current state, and may require one commit to reconcile changes. Rebase takes a whole branch and moves it, as if you started working on it from a more recent base commit, and will ask you to reconcile changes as it replays history.

This diagram seems wrong to me. Isn't the second image a squash merge? Also why would rebasing a feature branch change main?

Yeah, the image (not mine, but the best I found quickly) kinda shows a rebase+merge as the third image. As the other commenter mentioned, the new commit in the second image is the merge commit that would include any conflict resolutions.

why would rebasing a feature branch change main?

the image does not update the feature branch. It merges the featurebranch into main with a regular old merge-commit on the main branch.

The only difference between a *rebase-merge and a rebase is whether main is reset to it or not. If you kept the main branch label on D and added a feature branch label on G', that would be what @andrew@lemmy.stuart.fun meant.

That's pretty cool, might actually do that. Tho, we currently don't use the history as much anyways, we're just having a couple of small student projects with the biggest group being 6 people. I guess it's more useful if you're actually making a real product in a huge project that has a large team behind it

Just remember to not combine it with force push or you're in for some chaos (rewriting history team members have already fetched is a big no-no).

Facts. Force push belongs in Star Wars, and nowhere else.

Or, you know, on your own feature branch to clean up your own commits. It's much, much better than constantly littering your branch's history with useless merge commits from upstream, and it lets you craft a high-quality, logical commit history.

Of course it has its uses. I didn't mention them because the guy just learned about rebase - it's unlikely to be applied flawlessly from the start.

I was replying to the other comment, not yours. Though there's not really a way of using rebasing without force pushing unless it's a no-op.

Rebasing is really not a big deal. It's not actually hard to go back to where you were, especially if you're using git rebase --interactive. For whatever reason people don't seem to get that commits aren't actually ever lost and it's not that hard to point HEAD back to some previous commit.

I was replying to the other comment, not yours.

I know. Answered anyway because I thought of the same thing as you.

Though there's not really a way of using rebasing without force pushing unless it's a no-op.

I like to rebase after fetching and before pushing. IMO that's the most sensible way to use it even in teams that generally prefer merge. It's also not obvious to beginners since pull is defaulted to fetch+merge.

Ah gotcha.

like to rebase after fetching and before pushing. IMO that's the most sensible way to use it even in teams that generally prefer merge.

What do you mean? Like not pushing at all until you're making the MR? Because if the branch has ever been pushed before and you rebase, you're gonna need to force push the branch to update it.

Personally I'm constantly rebasing (like many times a day) because I maintain a clean commit history as I develop (small changes to things I did previously get commits and are added to the relevant commit as a fixup during interactive rebasing). I also generally keep a draft MR up with my most recent work (pushing at end of day) so that I can have colleagues take a look at any point if I want to validate anything about the direction I'm taking before continuing further (and so CI can produce various artifacts for me).

It's also not obvious to beginners since pull is defaulted to fetch+merge.

Yeah, pull should definitely be --ff-only by default and it's very unfortunate it isn't. Merging on pull is kind of insane behavior that no one actually wants.

What do you mean? Like not pushing at all until you're making the MR? Because if the branch has ever been pushed before and you rebase, you're gonna need to force push the branch to update it.

Not everyone works in large orgs that require pull requests. We have a dev branch multiple devs push to and just branch off for test phase. So I commit locally (also interactive rebasing when fixing stuff from earlier). When I'm ready to push, I fetch, rebase and push. I never force push here.

Uh, it's definitely a bad idea to be concurrently developing on the same branch for a lot of reasons, large org or not. That's widely considered a bad practice and is just a recipe for trouble. My org isn't that huge, and on our team for our repo we have 9 developers working on it including myself. We still do MRs because that's the industry standard best practice and sidesteps a lot of issues.

Like, how do you even do reviews? Patch files?

You can do all that without force push. Just make a new branch and do the cleanup before the first push there. Allowing force push just invites disaster from junior developers who don't know what they're doing. If you want to clean up after them, that's your business, I guess.

That's exactly the same thing. A branch is nothing more than a commit that you've given a name to. Whether that name is your original branch's name or a new branch's name is irrelevant. The commit would be the same either way.

A junior cannot actually do any real damage or cause any actual issue. Even if they force push "over" previous work (which again, is just pointing their branch to a new commit that doesn't include the previous work),, that work is not lost and it's trivial to point their branch to the good commit they had previously. It's also a good learning opportunity. The only time you actually can lose work is if you throw away uncommitted changes, but force pushing or not is completely irrelevant for that.

Force pushes are perfectly safe if you're working on your own branch, and even if you're sharing a branch, you can still force push to it as long as you inform and coordinate with whoever else is working on that branch.

1 more...

I wouldn't recommend it. The Git documentation itself doesn't recommend rebase for more than moving a few unpushed commits to the front of a branch you are updating. Using it by default instead of merge requires you to use --force-push as part of your workflow which can lead to confusing situations when multiple developers end up commiting to the same branch, and at worst can lead to catastrophic data loss. The only benefit is a cleaner history graph, which is rarely used anyway, and you can always make the history graph easier to read with a gui without incuring any of the problems of rebase.

Bad take IMO,

At 10+ YOE, I use rebase almost exclusively. Branch from main, rebase to clean up commit history before putting up a PR. If commits are curated properly you don't run into conflicts very often. Branches really shouldn't be shared too often anyway, and the ones that are should be write protected.

Catastrophic data loss isn't really possible either with git since it's all preserved and you can git reflog even if you mess up.

The meme is right. Git good

When rebasing, it applies the changes without the commit history?

Does that mean that when you fast forward your main/dev branch and commit, you then add a single commit that encompasses every changes that were rebase?

No, there are no fast-forwards with rebasing. A rebase will take take the diff of each commit on your feature branch that has diverged from master and apply those each in turn, creating new commits for each one. The end result is that you have a linear history as though you had branched from master and made your commits just now.

If you had branched like this:

A -> B -> C (master)
   \
     \ -> D (feature)

It would like this after merging master into your feature branch:

A -> B -> C (master) ->   E (feature)
  \                                    /
    \ -> D -------------------> /

And it would like this if you instead rebased your feature branch onto master:

A -> B -> C (master) -> D' (feature)

This is why it's called a "rebase": the current state of master becomes the starting point or "base" for all of your subsequent commits. Assuming no conflicts, the diff between A and D is the same as the diff between A and D'.

Years of experience don't really matter here, that's just call to authority, in this case yourself. You might as well be the worst git user ever after 20 years of usage, or the best after 2. We don't know that.

Anyway, what you're saying basically requires a perfect world to be true. Feature branch flow is perfectly fine, but you do end up with merge conflicts constantly, unless you have cordoned off areas of the repo for certain users. Two people working on unrelated features, both change a signature of some helper/util method, merge conflict. Nothing serious, can be fixed in a minute, and rebasing or merging won't help for either.

Merge is perfectly fine. And arguing about which strategy to use is one of those autistic debates we as an industry seemingly love to have. It doesn't matter, but you'll find people screaming at each other about it. See Emacs vs. Vi. Same crap.

Merge is fine, but not knowing both rebase and merge is dumb. And I guess I've been in a perfect world this whole time in huge technical orgs lol.

This a really bad take and fundamentally misunderstands rebasing.

First off, developers should never be committing to the same branch. Each developer maintains their own branch. Work that needs to be tested together before merging to master belongs on a dedicated integration branch that each developer merges their respective features branches into. This is pretty standard stuff.

You don't use rebasing on shared branches, and no one arguing for rebasing is suggesting you do that. The only exception might be perhaps a dedicated release manager preparing a release or a merge of a long-running shared branch. But that is the kind of thing that's communicated and coordinated.

Rebasing is for a single developer working on a feature branch to produce a clean history of their own changes. Rebasing in this fashion doesn't touch any commits other than the author's. The purpose is to craft a high quality history that walks a reader through a proposed sequence of logical, coherent changes.

Contrary to your claim, a clean history is incredibly valuable. There's many tools in git that benefit significantly from clean, well-organizes commits. git bisect, git cherry-pick... Pretty much any command that wants to pluck commits from history for some reason. Or even stuff like git log -L or git blame are far more useful when the commit referenced is not some giant amalgamation of changes from all over the place.

When working on a feature branch, if you're merging upstream into your branch, you're littering your history with pointless, noisy commits and making your MR harder to review, in addition to making your project's history harder to understand and navigate.

1 more...

I'm relatively new to git and rebase looks like a mess to me? Like it appears to be making duplicate commits and destroys the proper history?

If you use rebase to get a more readable history, isn't the issue the tool you use to view the history?

I guess I have to try it out a few times to get it.

What you probably mean by duplicate commits is that it assigns new commit IDs to commits that have been rebased. If you had already pushed those commits, then git status will tell you that the remote branch and your local branch have diverged by as many commits as you rebased.

Well, and what is the "proper history"? If your answer is "chronological", then why so?
For the rare times that it matters when exactly a commit was created, they've got a timestamp. But otherwise, the "proper history" is whatever you make the proper history. What matters is that the commits can be applied one after another, which a rebase ensures.

When you're working on a branch and you continuously rebase on the branch you want to eventually merge to, then the merged history will look as if you had checked out the target branch and just made your commits really quickly without anyone else committing anything in between.
And whether you've done your commits really quickly or over the course of weeks, that really shouldn't matter.

What is really cool about (supposedly) making commits really quickly is that your history becomes linear and it tells a comprehensible story. It won't be all kinds of unrelated changes mixed randomly chronologically, but rather related commits following one another.
And of course, you also lose the merge-commits, which convey no valuable information of their own.

you also lose the merge-commits, which convey no valuable information of their own.

In a feature branch workflow, I do not agree. The merge commit denotes the end of a feature branch. Without it, you lose all notion of what was and wasn't part of the same feature branch.

Agreed, you also lose the info about the resolved merge conflicts during the merge (which have been crucial a few times to me).

Well, with a rebase workflow, there should be no merge conflicts during the final merge. That should always be a fast-forward.

Of course, that's because you shift those merge conflicts to occur earlier, during your regular rebases. But since they're much smaller conflicts at a time, they're much easier to resolve correctly, and will often be auto-resolved by Git.

You're still right, that if you've got a long-running feature branch, there's a chance that a conflict resolution broke a feature that got developed early on, and that does become invisible. On the flip-side, though, the person working on that feature-branch has a chance to catch that breakage early on, before the merge happens.

The commits aren't duplicated, but applied to the main branch. Since git has commit ids, they won't be re-rebased either.

1 more...

Merge is taking all the code from the master branch and combining it with the task branch, resulting in a commit for just the merge itself.

Rebase is "re-basing" where your task branch was created from off the master branch. It essentially takes all the commits from master that happened since you branched, REWRITES THE HISTORY of your task branch by inserting those master branch commits before all your existing commits, and effectively makes your task branch look like it was branched yesterday instead of like 4 weeks ago. You changed where your task branch originated on the master. You moved its base.

Atlassian does a fantastic writeup on this.

So kinda like as if you had kept your branch synced the whole time?

Kind of. Both merge and rebase result in the branches "synced up" but they do it in different ways.

Merge is making a batter for cookies, having a bowl for dry ingredients (task branch) and a bowl for wet ingredients, (master branch) making them separately and then just dumping the dry bowl into the wet bowl (merge).

Rebase is taking a time machine back to before you started mixing the dry ingredients, mix all the wet ingredients first then add the dry ones on top of that in the same bowl.

It's really hard to create an analogy for this.

So, with a merge you basically shuffle in the changes from both branches, but a rebase takes only the changes from one branch and puts it over the other? Edit: no. Read wrong. I should probably watch a vid about it or something

1 more...

I know this is a meme post, but can someone succinctly explain rebase vs merge?

I am an amateur trying to learn my tool.

Merge keeps the original timeline. Your commits go in along with anything else that happened relative to the branch you based your work off (probably main). This generates a merge commit.

Rebase will replay all the commits that happened while you were doing your work before your commits happen, and then put yours at the HEAD, so that they are the most recent commits. You have to mitigate any conflicts that impact the same files as these commits are replayed, if any conflicts arise. These are resolved the same way any merge conflict is. There is no frivolous merge commit in this scenario.

TlDR; End result, everything that happened to the branch minus your work, happens. Then your stuff happens after. Much tidy and clean.

Thanks for the explanation. It makes sense. To my untrained eyes, it feels like both merge and rebase have their use. I will try to keep that in mind.

Yes. They do. A lot of people will use vacuous terms like "clean history" when arguing for one over the other. In my opinion, most repositories have larger problems than rebase versus merge. Like commit messages.

Also, remember, even if your team/repository prefers merges over rebases for getting changes into the main branch, that doesn't mean you shouldn't be using rebase locally for various things.

You nailed it with the critique of commit messages. We use gitmoji to convey at-a-glance topic for commits and otherwise adhere to Tim Pope's school of getting to the point

I must have read that blog post in the past because that's exactly the style I use. Much of it is standard though.

One MAJOR pet peeve of mine (and I admit it is just an opinion either way) is when people use lower case letters for the first line of the commit message. They typically argue that it is a sentence fragment so shouldn't be capitalized. My counter is that the start of sentences, even fragmented ones, should be capitalized. Also, and more relevant, is that I view the first line of the commit more like the title of something than a sentence. So I use the Wikipedia style of capitalizing.

Gitmoji?

https://gitmoji.dev/

Quasi parallel reply to your other post, this would kind of echo the want for a capital letter at the start of the commit message. Icon indicates overall topic nature of commits.

Lets say I am adding a database migration and my commit is the migration file and the schema. My commit message might be:

     🗃️ Add notes to Users table

So anyone looking at the eventual pr will see the icon and know that this bunch of work will affect db without all that tedious "reading the code" part of the review, or for team members who didn't participate in reviews.

I was initially hesitant to adopt it but I have very reasonable, younger team mates for whom emojis are part of the standard vocabulary. I gradually came to appreciate and value the ability to convey more context in my commits this way. I'm still guilty of the occasionally overusing:

   ♻️ Fix the thing

type messages when I'm lazy; doesn't fix that bad habit, but I'm generally much happier reading mine or someone else's PR commit summary with this extra bit of context added.

I looked at it and there's a lot of them!

I see things like adding dependencies but I would add the dependency along with the code that's using it so I have that context. Is the Gitmoji way to break your commits up so that it matches a single category?

Yes, that is another benefit, once you start getting muscle memory with the library. You start to parcel things by context a bit more. It's upped my habit of discrete commit-by-hunks, which also serves as a nice self-review of the work.

I don't see that as a benefit tbh - if I have a dependency, I want to see why it's there as part of the commit. I'm imagining running blame on Cargo.toml and seeing "Add feature x" vs "Add dependency". I guess the idea is it's "➕ Add dep y for feature x" but I'd still rather be able to see the related code in the same commit instead of having to find the useful commit in the log.

I suppose you could squash them together later, but then why bother splitting it out in the first place?

I see that some use a subset of Gitmoji and that does make sense to me - after all, you wouldn't use all of them in every project anyway, e.g. 🏷️ types is only relevant for a few languages.

How would rebasing my own branch work? Do I rebase the main into my branch, or make a copy of the main branch and then rebase? I have trouble grasping how that would work.

You're still rebasing your branch onto main (or whatever you originally branched it off of), but you aren't then doing a fast forward merge of main to your branch.

The terminology gets weird. When people say "merge versus rebase" they really mean it in the context of brining changes into main. You (or the remote repository) cannot do this without a merge. People usually mean "merge commit versus rebase with fast forward merge"

Yeah I was confused because you are right, merge is usually refered as the git merge and then git commit.

It makes sense. Thanks for the clarification

Here's an example

Say I work on authentication under feature/auth Monday and get some done. Tuesday an urgent feature request for some logging work comes in and I complete it on feature/logging and merge clean to main. To make sure all my code from Monday will work, I will then switch to feature/auth and then git pull --rebase origin main. Now my auth commits start after the merge commit from the logging pr.

100% they do. Rebase is an everyday thing, merge is for PRs (for me anyway). Or merges are for regular branches if you roll that way. The only wrong answer is the one that causes you to lose commits and have to use reflog, cos....well, then you done messed up now son... (but even then hope lives on!)

Yes. My rule of thumb is that generally rebasing is the better approach, in part because if your commit history is relatively clean then it is easier to merge in changes one commit at a time than all at once. However, sometimes so much has changed that replaying your commits puts you in the position of having to solve so many problems that it is more trouble than it is worth, in which case you should feel no qualms about aborting the rebase (git rebase --abort) and using a merge instead.

I have the bad habit of leaving checkpoints everywhere because of merge squash that I am trying to fix. I think that forcing myself to rebase would help get rid of that habit. And the good thing is that I am the sole FW dev at work, so I can do whatever I want with the repos.

Merge gives an accurate view of the history but tends to be "cluttered" with multiple lines and merge commits. Rebase cleans that up and gives you a simple A->B->C view.

Personally I prefer merge because when I'm tracking down a bug and narrow it down to a specific commit, I get to see what change was made in what context. With rebase commits that change is in there, but it's out of context and cluttered up with zillions of other changes from the inherent merges and squashes that are included in that commit, making it harder to see what was changed and why. The same cluttered history is still in there but it's included in the commits instead of existing separately outside the commits.

I honestly can't see the point of a rebased A->B->C history because (a) it's inaccurate and (b) it makes debugging harder. Maybe I'm missing some major benefit? I'm willing to learn.

I feel the opposite, but for similar logic? Merge is the one that is cluttered up with other merges.

With rebase you get A->B->C for the main branch, and D->E->F for the patch branch, and when submitting to main you get a nice A->B->C->D->E->F and you can find your faulty commit in the D->E->F section.

For merge you end up with this nonsense of mixed commits and merge commits like A->D->B->B'->E->F->C->C' where the ones with the apostrophe are merge commits. And worse, in a git lot there is no clear "D E F" so you don't actually know if A, D or B came from the feature branch, you just know a branch was merged at commit B'. You'd have to try to demangle it by looking at authors and dates.

The final code ought to look the same, but now if you're debugging you can't separate the feature patch from the main path code to see which part was at fault. I always rebase because it's equivalent to checking out the latest changes and re-branching so I'm never behind and the patch is always a unique set of commits.

For merge you end up with this nonsense of mixed commits and merge commits like A->D->B->B’->E->F->C->C’ where the ones with the apostrophe are merge commits.

Your notation does not make sense. You're representing a multi-dimensional thing in one dimension. Of course it's a mess if you do that.

Your example is also missing a crucial fact required when reasoning about merges: The merge base.
Typically a branch is "branched off" from some commit M. D's and A's parent would be M (though there could be any amount of commits between A and M). Since A is "on the main branch", you can conclude that D is part of a "patch branch". It's quite clear if you don't omit this fact.

I also don't understand why your example would have multiple merges.

Here's my example of a main branch with a patch branch; in 2D because merges can't properly be represented in one dimension:

M - A - B - C - C'
  \           /
    D - E - F

The final code ought to look the same, but now if you’re debugging you can’t separate the feature patch from the main path code to see which part was at fault.

If you use a feature branch workflow and your main branch is merged into, you typically want to use first-parent bisects. They're much faster too.

You're right, I'm not representing the merge correctly. I was thinking of having multiple merges because for a long running patch branch you might merge main into the patch branch several times before merging the patch branch into main.

I'm so used to rebasing I forgot there's tools that correctly show all the branching and merges and things.

Idk, I just like rebase's behavior over merge.

The thing is, you can get your cake and eat it too. Rebase your feature branches while in development and then merge them to the main branch when they're done.

I would advocate for using each tool, where it makes sense, to achieve a more intelligible graph. This is what I've been moving towards on my personal projects (am solo). I imagine with any moderately complex group project it becomes very difficult to keep things neat.

In order of expected usage frequency:

  1. Rebase: everything that's not 2 or 3. keep main and feature lines clean.
  2. Merge: ideally, merge should only be used to bring feature branches into main at stable sequence points.
  3. Squash: only use squash to remove history that truly is useless. (creating a bug on a feature branch and then solving it two commits later prior to merge).

History should be viewable from log --all --decorate --oneline --graph; not buried in squash commits.

Folks should make sure the final series of commits in pull requests have atomic changes and that each individual commit works and builds successfully alone. Things like fixup commits with auto squash rebase. THIS WAY you can still narrow it down to one commit regardless of the approach.

I used to only merge. Now I rebase. The repo is set up to require squash and rebase when going to main.

All the garbage "spelled thing wrong" and "ran formatter" commits go away. Main is clean and linear.

Squash and rebase or squash or rebase?

..and? You squash so all your gross "isort" "forgot to commit this file" "WIP but I'm getting lunch" commits can be cleaned up into a single "Add endpoint to allow users to set their blah blah" comment with a nice extended description.

You then rebase so you have a nice linear history with no weird merge commits hanging around.

Okay honest question, when you merge a PR in GitHub and choose the squash commits box is that "rebasing"? Or is that just squashing? Because it seems that achieves the same thing you're talking about.

There's two options in the green button on a pr. One is squash and merge, the other is squash and rebase.

Squashing makes one commit out of many. You should IMO always do this when putting your work on a shared branch

Rebase takes your commit(s) and sticks them on the end.

Merge does something else I don't understand as well, and makes a merge commit.

Also there was an earthquake in NYC when I was writing this. We may have angered the gods.

Lmao I'm in the NYC area and my whole house shook. I'm right there with you. Thanks for the explanation!

You should IMO always do this when putting your work on a shared branch

No. You should never squash as a rule unless your entire team can't be bothered to use git correctly and in that case it's a workaround for that problem, not a generally good policy.

Automatic squashes make it impossible to split commit into logical units of work. It reduces every feature branch into a single commit which is quite stupid.
If you ever needed to look at a list of feature branch changes with one feature branch per line for some reason, the correct tool to use is a first-parent log. In a proper git history, that will show you all the merge commits on the main branch; one per feature branch; as if you had squashed.

Rebase "merges" are similarly stupid: You lose the entire notion of what happened together as a unit of work; what was part of the same feature branch and what wasn't. Merge commits denote the end of a feature branch and together with the merge base you can always determine what was committed as part of which feature branch.

I don't want to see a dozen commits of "ran isort" "forgot to commit this file lol" quality.

Do you?

Having the finished feature bundled into one commit is nice. I wouldn't call it stupid at all.

Note that I didn't say that you should never squash commits. You should do that but with the intention of producing a clearer history, not as a general rule eliminating any possibly useful history.

You squash so all your gross "isort" "forgot to commit this file" "WIP but I'm getting lunch" commits can be cleaned up

The next step on the Git-journey is to use interactive rebasing in order to never push these commits in the first place and maintain a clean history to be consumed by the code reviewer.

Squashing is still nice in order to have a one-to-one relationship between commits on the main branch to pull requests merged, imo.

I'll go one further: use git rebase --interactive

I remember learning about how to use this back in the day and what a game changer it was for my workflow.

Today I like to do all of the commits as I’m working. Maybe dozens or more as I chug along, marking off waypoints rather than logging actual changes. When I’m done a quick interactive rebase cleans up the history to meaningful commits quite nicely.

The fun part is that I will work with people sometimes who both swear that “rewriting history” is evil and should never be done, but also tell me how useful my commit logs are and want to know how I take such good notes as I go.

Argh. I hate that argument.

Yes - "Rewriting history" is a Bad Thing - but o argue that's only on 'main' (or other shared branches). You should (IMHO) absolutely rewrite your local history pre-push for exactly the reasons you state.

If you rewrite main's history and force your changes everybody else is gonna have conflicts. Also - history is important for certain debugging and investigation. Don't be that guy.

Before you push though... rebasing your work to be easily digestible and have a single(ish) focus per commit is so helpful.

  • review is easier since concerns aren't mixed
  • If a commit needs to be reverted it limits the collateral damage
  • history is easier to follow because the commits tell a story

I use a stacked commit tool to help automate rebasing on upstream commits, but you can do it all with git pretty easily.

Anyway. Good on you; Keep the faith; etc etc. :)

The only other time rewriting history might be bad is when you’re working on a shared branch, which is the point of not rewriting main. If you are working solo on a branch, its history is only what you merge into main so it doesn’t fucking matter at all. If you’re not working solo, maybe you need to adopt a similar process or look at how you’re not working solo. The only time I touch another dev’s branch is at the PR stage and only for quick corrections or missing knowledge so it doesn’t matter if they rebased before or honestly rebase after before the final merge.

At my company we just use a squash policy in gitlab. Every merge request becomes a single commit to the main branch. Super easy to read the commit log because all commits are descriptive instead of a bunch of “fix MR comments” or “fix pipeline errors”.

Another advice: git reset [commit-id] followed with a git commit -a is a quick way to squash all your commits.

Another advice ...quick way to squash all your commits

in your IDE select the commits you want to squash. Then rightclick. Then "squash". All done.

I am still mystified by IDE VCS tools. It’s usually faster for me to do a quick CLI shuffle than use the IDE.

I use like 3 of the git-feature from intellij (out of 100 or so). But these 3 features save me a lot of time.

(the other 2 being the 3-way-merge-view and the commit-view where I can select changes for staging)

Even better, master creating fixup and squash commits and maintain logical commits as you work with git rebase -i --autosquash

This is really the only sane way to do it. I have run into some wonkyness with the commit history of the target branch commits not resembling git log, but that's usually for commits outside of what I'm trying to merge.

Edit: squashing commits down this way also helps reduce problems with replaying commit history on the actual rebase. In most cases you don't need all your "microcommits" in the history, and fewer commits just takes less time to reconcile.

Heres my based af workflow:

git checkout -b feature-branch

rebase on top of dev whilst working locally

git rebase origin/dev-branch && git push -f


if i need to fix conflicts with dev-branch during a PR

git merge origin/dev

I have been using something very similar to this. In my team I insisted on people without any git experience working on a separate local branch, than the feature branch

. To ensure screw ups are minimal, we pull and create a local feature branch and then a new local only dummy branch, on top of it. Once the team is more comfortable with git, I am planning to treat the local feature branch as a dummy branch.

So far things have been pretty neat. Spaghetti is no more with minimal conflicts.

Anyone mind explaining to me how git rebase is worth the effort?

git merge has it's own issues but I just don't see any benefit to rebase over it.

I use interactive rebases to clean up the history of messy branches so they can be reviewed commit by commit, with each commit representing one logical unit or type of change.

Mind you, getting those wrong is a quick way to making commits disappear into nothingness. Still useful if you're careful. (Or you can just create a second temporary branch you can fall back onto of you need up your first once.)

This 100%. I hate getting added to a PR for review with testing commits in the history, and I'm expected to clean those up before merging into main.

I feel like squash and merge on GitHub/GitLab is nicer for that anyway though, it makes the main branch so much cleaner automatically

If you're using "trunk-based development" (everything is a PR branch or in main), this works great.

If you're using GitFlow, it can make PRs between the major prod/dev/staging branches super messy. It would be nice if GitHub would let you define which merge strategies are allowed per-branch, but that's not a thing (AFAIK). So you're probably better off not squashing in this situation.

Well, rebase allows you to resolve the same conflict ten times in a row instead of doing it once. How cool is that?

Squash your branch first

Doesn't this defeat the purpose, may as well merge then no?

Do not merge your unfinished stuff into main.

I don't like merging main into my branch because I don't understand git, and I feel like that can make a confusing history.

The way I structure my commits, it is usually (but not always) easier and more reliable for me to replay my commits one at a time on top of the main branch and see how each relatively small change needs to be adapted in isolation--running the full test suite at each step to verify that my changes were correct--than to be presented with a slew of changes all at once that result from marrying all of my changes with all of the changes made to the main branch at once. So I generally start by attempting a rebase and fall back to a merge if that ends up creating more problems than it solves.

Rebase feature branch, merge commit into main (NO SQUASH).

make the commit message be "we’ve fixed bugs and improved performance. to experience the newest features and improvements, checkout the latest version of the branch."

I like rebase, it's everyone else that hate it when I rebase main twice a day.

git rebase is only for terrorists. 🥸

Also for me when I’ve been drinking and committed some really stupid shit into the repo. No one needs to know what I really think of my team members.

Yeah totally merge everything, people like a good spaghetti salad.

i like to create ten different checkouts of main, rebase them all slightly differently and then no fast forward merge them all back into each other

I like the idea of it and there were times i used it correctly, but most of the time i do it wrong i guess.

To produce 1 commit, I end up rebasing the damm thing at least 3 times. If there is an problem, it's at least 2³ times.