
Undo Git Rebase


I ran git rebase master on my branch and didn't realize it wasn't what I wanted until after I had pushed it to the remote. Only one other person and I work on the project, so I know they haven't pulled the latest changes.

Other questions on Stack Overflow said to use git reflog and then git reset --hard HEAD@{n} to get back to the state before the rebase. I did this to go to a commit I created before the rebase, but it didn't restore things to how they were before.

Am I missing a step? Is there a way to have the other person force-push his repo back up to restore things to how they were?

Thanks


Solution

  • As Makoto already noted, you probably shouldn't bother undoing this rebase: it's probably what you wanted. Nonetheless, feel free to read on for how to undo it.


    Use the reflog for the branch, as it will be easier to read. (The HEAD reflog has the same information, but has a whole lot more stuff in it, hence it's harder to find what you're looking for.)

    For instance if I had just rebased mybranch, I would see:

    $ git reflog mybranch
    nnnnnnn mybranch@{0}: rebase finished: refs/heads/mybranch onto biguglysha1
    ooooooo mybranch@{1}: commit: some sort of commit message
    ...
    

    The name mybranch@{1} is therefore synonymous (at the moment) with ooooooo, the old abbreviated SHA-1. Every time you do something to the branch (such as git reset), the number inside the @{...} part changes, while the SHA-1s are permanent, so it's a bit safer to use the SHA-1 (full or abbreviated) for cut-and-paste.
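To see the numbering shift, here is a minimal sketch in a throwaway repository. The temporary directory, file names, identity settings, and commit messages are all made up for illustration. It resolves mybranch@{1} twice, before and after one more commit, and the same name ends up pointing at a different SHA-1.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mybranch   # name the initial branch
git config user.name "Example User"
git config user.email "user@example.com"

echo one > one.txt; git add one.txt; git commit -q -m "one"
c1=$(git rev-parse HEAD)
echo two > two.txt; git add two.txt; git commit -q -m "two"
c2=$(git rev-parse HEAD)

# Right now mybranch@{1} is the previous value of the branch: commit "one".
first=$(git rev-parse "mybranch@{1}")

# Any further update to the branch shifts the numbering by one.
echo three > three.txt; git add three.txt; git commit -q -m "three"
second=$(git rev-parse "mybranch@{1}")

echo "before: $first"
echo "after:  $second"
```

The SHA-1 saved in c1 still names commit "one" afterwards, which is why copying the SHA-1 is the safer habit.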

    If you then:

    $ git checkout mybranch # if needed
    

    and:

    $ git reset --hard ooooooo  # or mybranch@{1}
    

    you should have the original back. This is because rebase simply copies commits and then moves the label. After the rebase, but before the reset, the commit graph looks something like this, where A through C are "your" commits:

              A - B - C              <-- (only in reflog now)
            /
    ... - o - o - o - A' - B' - C'   <-- mybranch (after rebase)
    

    and git reset simply[1] erases the current branch label and pastes it onto the supplied SHA-1 (turning a reflog name into an SHA-1 first if needed). Hence, after the reset:

              A - B - C              <-- mybranch, plus older reflog
            /
    ... - o - o - o - A' - B' - C'   <-- (only in reflog now)
    

    Note that now, post-reset, the rebase-made commit copies are the "abandoned" ones that are only found in reflog entries. The originals, which had been abandoned, are now claimed under mybranch again.
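The whole round trip can be tried safely in a scratch repository. The following is only a sketch under assumptions: the temporary directory, file names, and commit messages are invented, and the rebase is conflict-free by construction. It records the branch tip, rebases, resets back, and then checks that the rebased copies are still reachable through the branch reflog.

```shell
#!/bin/sh
set -e

# Scratch repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "base"

# "Your" commits A and B on mybranch.
git checkout -q -b mybranch
echo a > a.txt; git add a.txt; git commit -q -m "A"
echo b > b.txt; git add b.txt; git commit -q -m "B"

# Meanwhile master moves on.
git checkout -q master
echo m > m.txt; git add m.txt; git commit -q -m "more master work"

# Rebase mybranch onto master, remembering the old tip first.
git checkout -q mybranch
before=$(git rev-parse mybranch)
git rebase -q master
rebased=$(git rev-parse mybranch)

# Undo: move the label back to the original commits.
git reset -q --hard "$before"
after=$(git rev-parse mybranch)

# The rebased copies are now the ones that live only in the reflog.
previous=$(git rev-parse "mybranch@{1}")

git reflog mybranch
```

Running git log --graph --oneline mybranch "mybranch@{1}" in that repository afterwards draws both chains, the originals and the copies, much like the diagrams above.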

    The way to think about this stuff is to draw the commit graph (with new commits pointing back at their parent commits), then draw in branch labels with long arrows pointing to the commit graph. The graph never[2] changes except to add new commits, which have new and different big ugly SHA-1s (which is why I use letters like A, B and C instead, and tack on additives like A' for copies). The SHA-1s are guaranteed to be unique[3] and are permanent, but the labels, with their long arrows, get erased and re-pointed all the time. (If you do this on a whiteboard, you should generally use black for the commit graph and a color, or several colors, for the labels.)


    [1] Well, git reset does more than just move the label, unless you add some command-line flags. By default, it moves the label and resets the index; with --hard, it moves the label and resets the index and cleans out your work-tree. With --soft it just moves the label, leaving the index and work-tree alone. With git being what it is, there are a bunch more flags that twist up the meaning even further, but those are the big three: --soft, nothing aka --mixed, and --hard.
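The three modes can be compared side by side in a scratch repository. This is a sketch, not a reference: the report helper, the file name, and the labels it prints are made up here. After each reset back to the first commit it asks git diff whether the index differs from HEAD and whether the work-tree differs from the index.

```shell
#!/bin/sh
set -e

# Scratch repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > f.txt; git add f.txt; git commit -q -m "first"
c1=$(git rev-parse HEAD)
echo "version two" > f.txt; git add f.txt; git commit -q -m "second"
c2=$(git rev-parse HEAD)

# Hypothetical helper: summarize index and work-tree state.
report() {
    if git diff --cached --quiet; then index=clean; else index=staged; fi
    if git diff --quiet; then tree=clean; else tree=modified; fi
    echo "index:$index tree:$tree"
}

# --soft: only the label moves; the index still holds "version two".
git reset -q --soft "$c1";  soft=$(report)
git reset -q --hard "$c2"   # back to the starting point

# --mixed (the default): label and index move; the work-tree keeps "version two".
git reset -q --mixed "$c1"; mixed=$(report)
git reset -q --hard "$c2"

# --hard: label, index, and work-tree all match the first commit.
git reset -q --hard "$c1";  hard=$(report)

echo "soft:  $soft"
echo "mixed: $mixed"
echo "hard:  $hard"
```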

    [2] If git only ever added things, your repo would grow huge over time. So "unreachable" commits (those with no label and no leftover reflog entry pointing to them, and that are not pointed to by any commit that is itself reachable) are eventually removed, along with any other unreachable objects, when git runs git gc for you automatically. You can force them to be removed earlier, but there's rarely a good reason to bother.
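For the curious, forcing the early removal looks roughly like the sketch below. It is destructive, so it runs only in a throwaway repository; the directory, branch name scratch, and the exists helper are invented for the example. It leaves one commit reachable only through the HEAD reflog, then expires the reflogs and prunes immediately.

```shell
#!/bin/sh
set -e

# Throwaway repository (hypothetical names throughout).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo keep > keep.txt; git add keep.txt; git commit -q -m "kept commit"

# Make a commit on a side branch, then delete the branch label.
git checkout -q -b scratch
echo gone > gone.txt; git add gone.txt; git commit -q -m "soon unreachable"
lost=$(git rev-parse HEAD)
git checkout -q master
git branch -q -D scratch

# Hypothetical helper: does the object still exist in the repository?
exists() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo missing; fi
}

# Still present: the HEAD reflog keeps the commit alive.
before_gc=$(exists "$lost")

# Drop all reflog entries, then collect unreachable objects right away.
git reflog expire --expire=now --all
git gc -q --prune=now

after_gc=$(exists "$lost")

echo "before gc: $before_gc"
echo "after gc:  $after_gc"
```

Once the second step has run there is no way back to that commit, which is a good reason to leave garbage collection on its default schedule in a real repository.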

    [3] Git itself depends on the guarantee. It's mathematically possible, but extremely improbable, for any two different objects to wind up with the same SHA-1. If this happens, git breaks.[4] If the distribution is good enough, the probability is 1 out of 2^160, which is really tiny. This is a good thing, as the "Birthday paradox" raises the possibility pretty rapidly, but because it started so tiny, it stays tiny and in practice it's never been a problem.
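As a rough back-of-the-envelope check on the birthday-paradox point, assume uniformly distributed 160-bit hashes and no deliberately crafted collisions (both are assumptions, and the figure of one billion objects is just an illustrative choice). The chance of at least one accidental collision among n objects is then approximately:

```latex
P(\text{collision}) \;\approx\; \frac{n(n-1)}{2} \cdot \frac{1}{2^{160}}
                    \;\approx\; \frac{n^{2}}{2^{161}},
\qquad
n = 10^{9} \;\Rightarrow\;
P \;\approx\; \frac{10^{18}}{2.9 \times 10^{48}} \;\approx\; 3.4 \times 10^{-31}.
```

So the number of pairs grows with the square of the object count, but the starting probability is small enough that the result stays negligible.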

    [4] By design, the "breakage" is that git simply stops adding objects, so that everything-so-far is still good. You then have to move to whatever new system has been devised to handle billions-of-objects repositories, the "next generation" git or whatever.