I've been pair programming with a colleague, and I know he squashed all the commits on our branch (that's how we do it in our team before merging). However, I need to modify the branch.
This context is irrelevant to the questions at hand. Problems will happen, and I'll have to use tools to make sure I get things right. Please do not make your whole response about this.
When I check it out, "git status" returns:
your branch and origin have diverged, and have x and y different commit(s) each, respectively. (use "git pull" if you want to integrate the remote branch with yours)
I've never found this message to be helpful (especially the "git pull" part). Though I know what the problem is here: the history was rewritten with git rebase and squash and git push --force.
But how would I know in the future if that's the case? How do I check why and how they have diverged? There could be multiple reasons this happened; is it possible to easily check them all?
Just to answer this part:
But how would I know in the future if that's the case?
Check:
git reflog <remote-tracking branch>
Maybe:
git reflog origin/feature
And you will see:
4a58a56ba42 refs/remotes/origin/feature@{3}: fetch origin: forced-update
That’s the smoking gun. The fetch will force-update the remote-tracking branch. And the reflog will record it.
(core.logAllRefUpdates set to true will record this, which is the default in non-bare repositories.)
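As a minimal sketch, the check can be wrapped in a small shell function. The name forced_updates is made up for illustration and origin/feature is an assumed branch name. The only thing it relies on is that the reflog message contains the text forced-update, as in the entry shown above.

```shell
# Hypothetical helper (the name is an assumption, not a Git command).
# Prints the reflog entries of a remote-tracking branch that record a
# forced update. Exits non-zero when no such entry exists.
forced_updates() {
    # $1: remote-tracking branch, for example origin/feature
    git reflog show "$1" | grep 'forced-update'
}
```

Calling forced_updates origin/feature after a fetch prints nothing if the branch only ever moved forward. Keep in mind that reflog entries expire, so an old rewrite may no longer show up.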
You seem to have a good understanding of the underlying problem. But I’ll discuss the problem with rewriting history for the wider audience.
Keep in mind what Git does for you automatically when doing anything
with it. This goes for solo work (like working a feature branch) but is
especially important when working closely with others. Git works with
the graph that you create effortlessly. Did you create five commits on
top of main? Git knows. Did you create two more commits and then
three more commits on main? Git knows as well. It will report it
(ahead-behind) and it will deal with merging as well. What you did and
what main has done in the meantime is clear.
Now say that you are working solo on your branch that you have never
shared. You can use git-rebase(1) to keep up to date with main:
git checkout feature
git rebase main
Without merge conflicts it is straightforward: all of your commits are
now on top of main. [1] If you get confused about what happened, use
git reflog to look at each rebase operation (like pick), or use
git reflog <your branch> to go back to the previous state of the branch.
Or use ORIG_HEAD. [2]
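A minimal sketch of the ORIG_HEAD route. It assumes that nothing else has moved ORIG_HEAD since the rebase (commands such as git reset and git merge set it as well). Both function names are made up for illustration.

```shell
# Hypothetical helpers (the names are assumptions, not Git commands).

# Show the history before the rewrite (ORIG_HEAD) next to the history
# after it (HEAD), so you can check that the rebase did what you expected.
show_rebase_effect() {
    git log --oneline --graph ORIG_HEAD HEAD
}

# Move the current branch back to where it was before the last rebase.
# --hard also resets the index and working tree, so uncommitted changes
# are lost. The rebased commits stay reachable through the reflog.
undo_last_rebase() {
    git reset --hard ORIG_HEAD
}
```

If ORIG_HEAD has been overwritten in the meantime, the entries from git reflog <your branch> can be used as the reset target instead.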
Also using git rebase --interactive is fine as long as you make sure
to check that your new state (after the rebase operation) is what you
expected.
Now say that you are working with someone else on the same branch.
Using git-rebase(1) becomes problematic because it rewrites history.
You already did that when working alone. But that could only confuse
yourself. Which is not a risk if you are careful and make sure that you get
what you expect from each operation like git rebase main. [3]
But with other people you can’t just rewrite history. And that’s not
because (or assuming) that they don’t know how rewriting history works.
Maybe they are an expert in it. But except for out-of-band
communication you can’t know when they will pull your changes.
Since this is about squashing: say that you have five commits on this branch and you decide to squash them since you think it’s just noise. You push that and they eventually pull. Now we’re back to working with what Git is good at: Git is good at working with your history graph. In the sense of making commits on top of the previous state. It’s not good at reporting rewritten graphs. Now it will tell your coworker that things are different. [4] It sees that the branch does not come after what you have. And it gives up. That’s it. Now your coworker can only guess. She can make some guesses. Let’s say she guesses correctly. From her perspective:
- main and my local branch on the last commit that I made
- main and the remote-tracking branch (your changes, the rewritten branch)

All those steps are manual since git(1) doesn’t give you anything (that I know of) to detect it out of the box.
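For the simplest version of that manual comparison, here is a minimal sketch. It assumes the squash kept the same base and changed no content, in which case both branch tips point at the same tree. The name same_tree is made up, and feature and origin/feature are assumed branch names.

```shell
# Hypothetical helper (the name is an assumption, not a Git command).
# Succeeds when two commits end in the same tree, i.e. the same file
# contents, no matter how the commits leading up to them are shaped.
same_tree() {
    # $1, $2: anything that names a commit, e.g. feature origin/feature
    [ "$(git rev-parse "$1^{tree}")" = "$(git rev-parse "$2^{tree}")" ]
}
```

If same_tree feature origin/feature succeeds, the rewrite only changed the shape of the history. If it fails, that tells you nothing yet: the branch may have been rebased onto a newer main, or the content may really differ.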
And that was the optimistic case:
- main hadn’t moved: you should probably have used git merge-base main feature just in case

Don’t rewrite anything that the other person has seen. Just merge incoming changes if you need to. You can do rewrite operations on things that only you have seen though. Like rebasing the three commits that you have made after she has updated the remote.
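To see which commits are still yours alone, a minimal sketch. It assumes the other person only gets your work through the shared remote branch and that your remote-tracking branch is up to date (you fetched recently). The function name is made up, and feature and origin/feature are assumed branch names.

```shell
# Hypothetical helper (the name is an assumption, not a Git command).
# Lists commits that are on the local branch but not on its
# remote-tracking counterpart, i.e. commits nobody could have fetched
# from that remote branch yet.
unpublished_commits() {
    # $1: local branch, $2: remote-tracking branch
    git log --oneline "$2..$1"
}
```

Whatever unpublished_commits feature origin/feature lists is a candidate for rewriting. Anything older than that has been pushed and may have been seen.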
Once you are done you can still rewrite the branch if you want. Eventually you are done. Then one of you can take over and rewrite the branch to get rid of all of the noise. Or even to make more commits by splitting them.
Notice the hand-off: once you take over the branch you are back to the solo case where rewriting is just fine.
Someone that rewrites history should tell everyone they collaborate with. They might make mistakes. They might not know the problems it causes. But in any case the real solution is to teach them so it doesn’t happen again. That’s preferable to hunkering down and trying to divine what they did after the fact.
git(1) does not support this in the general case. Keep in mind that rewriting history is very general. We might only be talking about squashing here. But it might not be intuitive to people that doing something slightly different than just one single strategy (like squashing) can make the discovery procedure fall apart.
Discovering that N commits have been squashed is feasible. Without conflicts. With conflicts it is impossible. [5] Then generalize to e.g. squashing N-2 commits but keeping the last two commits. That still sounds doable. But I would say that it is a complex enough problem to warrant a robust script or program. For those cases that are tractable.
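For the tractable case (a squash without conflicts), a minimal sketch of what such a script could start from. It compares the combined change of each branch relative to its own merge base with main, using git-patch-id(1), which hashes a diff while ignoring line numbers and whitespace. The function name is made up, and main, feature and origin/feature are assumed names.

```shell
# Hypothetical helper (the name is an assumption, not a Git command).
# Prints the patch ID of everything a branch adds on top of its merge
# base with main. Assumes a local branch called main exists.
branch_patch_id() {
    # $1: branch to inspect, e.g. feature or origin/feature
    git diff "$(git merge-base main "$1")" "$1" |
        git patch-id --stable |
        cut -d' ' -f1
}
```

If branch_patch_id feature and branch_patch_id origin/feature print the same value, both branches introduce the same total change, even when main has moved in between. It cannot tell you which commits were squashed, only that the sum is the same.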
Keep in mind that you will get no feedback if you make a rewrite that is “intractable” to discover for the other party. Again I reiterate the value of sticking to the happy path for Git: a graph which is built upon without rewriting anything.
Now. Git does have built-in tools to check whether commits are patch-identical (they introduce the same change). These include git-cherry(1).
Also git-rebase(1) can detect patch-identical commits. [1]
But it might be relevant to keep in mind that git-cherry(1) is not meant for branch collaboration workflows. It is meant for the workflow where you submit patches as emails to an upstream (gitworkflows(7)). There you have left the Git DB and therefore cannot track your commit in the upstream since it will be applied as a new commit. So in my opinion you shouldn’t use these tools when working on the same branch (in the same Git DB).
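For reference, this is roughly what git-cherry(1) reports in the situation it was designed for, where a commit of yours has been applied upstream as a new commit. A minimal sketch: the function name is made up, and main and feature are assumed names.

```shell
# Hypothetical helper (the name is an assumption, not a Git command).
# Counts the commits on a topic branch that already have a
# patch-identical counterpart upstream. git cherry marks those with a
# leading "-" and all other commits with a leading "+".
applied_upstream_count() {
    # $1: upstream branch, $2: topic branch
    git cherry "$1" "$2" | grep -c '^-'
}
```

Since the comparison is made commit by commit, a squash of several commits will generally not match any of the original commits.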
[1] main. And git-rebase(1) will then skip those commits that you made. But that is not going to happen if you are working on something completely new.
[2] ORIG_HEAD. (I think this stands for “original HEAD”. I used to think it was “origin HEAD” but it has got nothing to do with the conventional origin remote.)