git git-squash

How to see if remote branch was squashed


I've been pair programming with a colleague, and I know he squashed all the commits on our branch (that's what our team does before merging). However, I need to modify the branch.

This context is irrelevant to the questions at hand. Problems will happen, and I'll have to use tools to make sure I get things right. Please do not make your whole response about this.

When I check it out, "git status" returns:

your branch and origin have diverged, and have x and y different commit(s) each, respectively. (use "git pull" if you want to integrate the remote branch with yours)

I've never found this message to be helpful (especially the "git pull" part). I do know what the problem is here, though: the history was rewritten with git rebase and squash and git push --force.

But how would I know in the future if that's the case? How do I check why and how they have diverged? There could be multiple reasons this happened; is it possible to easily check them all?


Solution

  • Just to answer this part:

    But how would I know in the future if that's the case?

    Check:

    git reflog <remote-tracking branch>
    

    Maybe:

    git reflog origin/feature
    

    And you will see:

    4a58a56ba42 refs/remotes/origin/feature@{3}: fetch origin: forced-update
    

    That’s the smoking gun. The fetch will force-update the remote-tracking branch. And the reflog will record it.

    (This is recorded when core.logAllRefUpdates is set to true, which is the default in non-bare repositories.)
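
    As a rough sketch of how this check could be scripted: the helper below, forced_updates, is a hypothetical name (not a Git command) that filters the reflog of a remote-tracking branch for forced-update entries. It assumes the rewrite has already been fetched and that the reflog entry has not expired yet.

```shell
# Hypothetical helper: list the forced updates recorded for a remote-tracking branch.
# $1: remote-tracking branch, e.g. origin/feature
forced_updates() {
    # "git reflog show <ref>" prints one line per recorded update of <ref>;
    # a fetch that had to rewrite the ref is logged with "forced-update".
    # The exit status is grep's: 0 if at least one forced update was found.
    git reflog show "$1" | grep 'forced-update'
}

# Example (run inside a clone, after "git fetch"):
#   forced_updates origin/feature && echo "origin/feature was rewritten upstream"
```

    Right after the fetch that recorded the forced update, origin/feature@{1} still names the old tip, so the two states can be compared.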


    You seem to have a good understanding of the underlying problem. But I’ll discuss the problem with rewriting history for the wider audience.

    Keep in mind what Git does for you automatically when doing anything with it. This goes for solo work (like working on a feature branch) but is especially important when working closely with others. Git effortlessly works with the graph that you create. Did you create five commits on top of main? Git knows. Did you create two more commits while three more commits landed on main? Git knows as well. It will report it (ahead-behind) and it will deal with merging as well. What you did and what main has done in the meantime is clear.

    Now say that you are working solo on your branch that you have never shared. You can use git-rebase(1) to keep up to date with main:

    git checkout feature
    git rebase main
    

    Without merge conflicts it is straightforward: all of your commits are now on top of main. [1] If you get confused about what happened, use git reflog <your branch> to find the state from before the rebase and go back to it; plain git reflog (the HEAD reflog) also lists each rebase step (like pick). Or use ORIG_HEAD. [2]
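
    A minimal sketch of that review-and-undo routine for the solo case. The function names are made up for illustration, and both helpers assume that ORIG_HEAD still names the pre-rebase tip (see note 2). git range-diff needs a reasonably recent Git (it was added in 2.19).

```shell
# Hypothetical helpers for the solo case, run on the branch that was just rebased.

# Show how the rebased commits differ from the pre-rebase ones.
# $1: the branch that was rebased onto, e.g. main
compare_with_pre_rebase() {
    # Three-argument form: compares <base>..ORIG_HEAD against <base>..HEAD.
    git range-diff "$1" ORIG_HEAD HEAD
}

# Put the current branch back where it was before the rebase.
# Careful: --hard also discards uncommitted changes in the working tree.
undo_last_rebase() {
    git reset --hard ORIG_HEAD
}

# Example:
#   git checkout feature && git rebase main
#   compare_with_pre_rebase main   # not what you expected?
#   undo_last_rebase
```

    If ORIG_HEAD has been moved by a later operation, the branch reflog (for example feature@{1}) is the more reliable way back.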

    Also using git rebase --interactive is fine as long as you make sure to check that your new state (after the rebase operation) is what you expected.

    Now say that you are working with someone else on the same branch. Using git-rebase(1) becomes problematic because it rewrites history. You already did that when working alone, but then you could only confuse yourself, which is not a risk if you are careful and make sure that you get what you expect from each operation like git rebase main. [3] But with other people you can’t just rewrite history. And that’s not because (or assuming) that they don’t know how rewriting history works. Maybe they are an expert in it. But except for out-of-band communication you can’t know when they will pull your changes.

    Since this is about squashing: say that you have five commits on this branch and you decide to squash them since you think it’s just noise. You push that and they eventually pull. Now we’re back to working with what Git is good at: Git is good at working with your history graph, in the sense of making commits on top of the previous state. It’s not good at reporting rewritten graphs. Now it will tell your coworker that things are different. [4] It sees that the remote branch does not come after what she has. And it gives up. That’s it. Now your coworker can only guess. Let’s say she guesses correctly. From her perspective:

    1. Maybe he squashed all commits?
    2. Okay, let’s compare these two diffs:
      • Diff main and my local branch on the last commit that I made
      • Diff main and the remote-tracking branch (your changes, the rewritten branch)
      • Oh, they are equal! He did squash it.
    3. Okay, I’ll just hard-reset to the remote-tracking branch
    4. But I also have three more commits since he made that squash. I’ll just cherry-pick or rebase those on top of that hard-reset.

    All those steps are manual since git(1) doesn’t give you anything (that I know of) to detect it out of the box.
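
    The manual steps above could be sketched roughly like this. adopt_squash is a hypothetical helper, not something Git ships. It assumes that the rewrite was just fetched (so the reflog entry <remote-tracking branch>@{1} is the pre-squash tip) and that the local commits sit on top of that old tip. It also swaps in a simpler test: it compares the two snapshots directly instead of their diffs against main, which amounts to the same check only as long as main hadn’t moved.

```shell
# Hypothetical helper: adopt a squashed remote branch and replay local work on top.
# $1: remote-tracking branch (e.g. origin/feature), already fetched.
# Run it while on the local branch that tracks $1.
adopt_squash() {
    old=$(git rev-parse --verify "$1@{1}") || return 1   # tip before the forced update
    new=$(git rev-parse --verify "$1") || return 1       # rewritten tip

    # A pure squash changes history but not content: both tips hold the same files.
    if git diff --quiet "$old" "$new"; then
        # Replay only the commits made after the old tip onto the squashed commit.
        # With no local commits this simply moves the branch to the new tip.
        git rebase --onto "$new" "$old"
    else
        echo "$1 changed content as well; inspect it by hand" >&2
        return 1
    fi
}

# Example:
#   git fetch origin && adopt_squash origin/feature
```

    Using git rebase --onto covers steps 3 and 4 in one go; a hard reset followed by cherry-picks would be the more explicit equivalent.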

    And that was the optimistic case:

    1. She guessed what you did correctly
    2. You hadn’t made more commits on top of the squash, or even made commits on your end (not hers) which you squashed into the single commit as well
    3. main hadn’t moved: you should probably have used git merge-base main feature just in case
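
    When main has moved, the two snapshots no longer match even for a pure squash, so the comparison has to be made on the changes themselves. A hedged sketch using git merge-base and git patch-id; branch_patch_id is a hypothetical name, and main is just the example mainline branch.

```shell
# Hypothetical helper: print one patch ID for everything a branch adds on top of the mainline.
# $1: mainline branch (e.g. main), $2: branch tip or commit to summarise
branch_patch_id() {
    base=$(git merge-base "$1" "$2") || return 1
    # patch-id hashes the diff while ignoring line numbers and whitespace;
    # its second output column is dropped because the input is not a single commit.
    git diff "$base" "$2" | git patch-id --stable | cut -d' ' -f1
}

# Example: compare the pre-rewrite tip with the rewritten branch.
#   [ "$(branch_patch_id main 'origin/feature@{1}')" = "$(branch_patch_id main origin/feature)" ] \
#       && echo "same overall change"
```

    This is only a heuristic for the tractable case: if the commits that landed on main touched the same lines or their surrounding context, the IDs will differ even though nothing but a squash and rebase happened.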

    How to work with other people on a shared branch

    Don’t rewrite anything that the other person has seen. Just merge incoming changes if you need to. You can do rewrite operations on things that only you have seen though. Like rebasing the three commits that you have made after she has updated the remote.
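
    To see which commits fall into the “only you have seen” category, something like the following could be used. unpublished_commits is a hypothetical helper name, and the answer is only as fresh as the last fetch or push.

```shell
# Hypothetical helper: list the commits on the current branch that are not
# on the remote-tracking branch yet. Only these are safe to rewrite.
# $1: remote-tracking branch, e.g. origin/feature
unpublished_commits() {
    git log --oneline "$1"..HEAD
}

# Integrating the other person's work without rewriting anything they have seen:
#   git fetch origin
#   git merge origin/feature
```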

    You can still rewrite the history when you are done

    Once you are done you can still rewrite the branch if you want. Eventually you are done. Then one of you can take over and rewrite the branch to get rid of all of the noise. Or even to make more commits by splitting them.

    Notice the hand-off: once you take over the branch you are back to the solo case where rewriting is just fine.

    Out-of-band information: tell them

    Someone that rewrites history should tell everyone they collaborate with. They might make mistakes. They might not know the problems it causes. But in any case the real solution is to teach them so it doesn’t happen again. That’s preferable to hunkering down and trying to divine what they did after the fact.

    Truly discovering rewritten history after the fact

    git(1) does not support this in the general case. Keep in mind that rewriting history is very general. We might only be talking about squashing here. But it might not be intuitive to people that doing something slightly different than just one single strategy (like squashing) can make the discovery procedure fall apart.

    Discovering that N commits have been squashed is feasible when the squash involved no conflicts. With conflicts it is impossible. [5] Then generalize to e.g. squashing N-2 commits but keeping the last two commits. That still sounds doable. But I would say that it is a complex enough problem to warrant a robust script or program, for those cases that are tractable.

    Keep in mind that you will get no feedback if you make a rewrite that is “intractable” to discover for the other party. Again I reiterate the value of sticking to the happy path for Git: a graph which is built upon without rewriting anything.

    Now, Git does have built-in tools to check whether commits are patch-identical (that is, whether they introduce the same change), such as git-cherry(1).

    Also git-rebase(1) can detect patch-identical commits. [1]

    But it might be relevant to keep in mind that git-cherry(1) is not meant for branch collaboration workflows. It is meant for the workflow where you submit patches as emails to an upstream (gitworkflows(7)). There you have left the Git DB and therefore cannot track your commit in the upstream since it will be applied as a new commit. So in my opinion you shouldn’t use these tools when working on the same branch (in the same Git DB).
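
    For reference, a small sketch of what git-cherry(1) reports in the patch-submission setting it was designed for. already_upstream is a hypothetical helper name and the branch names are placeholders.

```shell
# Hypothetical helper: list commits on a topic branch whose change is already upstream.
# git cherry prefixes such patch-identical commits with "-" and commits that are
# still missing upstream with "+".
# $1: upstream branch (e.g. main), $2: topic branch (e.g. feature)
already_upstream() {
    git cherry -v "$1" "$2" | grep '^-'
}

# Example:
#   already_upstream main feature
```

    Since it works commit by commit, a squash of several commits will not show up as patch-identical to any single original commit.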

    Notes

    1. Technically you could have commits which are patch-identical to something that has landed in main. And git-rebase(1) will then skip those commits that you made. But that is not going to happen if you are working on something completely new.
    2. Assuming that you have not used any other operations that change ORIG_HEAD. (I think this stands for “original HEAD”. I used to think it was “origin HEAD”, but it has got nothing to do with the conventional origin remote.)
    3. But like with anything: you have to do it in order to get comfortable with it, to learn and defeat confusion.
    4. The question here is about what someone else did. But it’s simpler for the narrative to discuss what “you” did. All hypothetical anyway.
    5. Programmatically impossible: I am not talking about heuristics like comparing both the changes (diff) and the subject.