I found many interesting posts about git fsck, so I wanted to experiment a little with them. First of all, the sources I read before asking this question:
How can I find an unreachable commit hash in a GIT repository by keywords?
git fsck: how --dangling vs. --unreachable vs. --lost-found differ?
I started with this repo:
* 9c7d1ea (HEAD -> test) f
* cd28884 e
| * 7b7bac0 (master) d
| * cab074f c
|/
* d35af2c b
| * f907f39 r # unreferenced commit
|/
* 81d6675 a
Here r was created from a detached HEAD at a.
Then I wanted to rebase master on test, but I had some unstaged changes, so I did:
git rebase --autostash test
This gave the following (I am not showing r, but it is still there):
* caee68c (HEAD -> master) d
* 2e1cb7d c
* 9c7d1ea (test) f
* cd28884 e
* d35af2c b
* 81d6675 a
Next I ran:
$ git fsck
#...
dangling commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
#...
$ git fsck --unreachable
#...
unreachable commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
unreachable commit d8bb677ce0f6602f4ccad46123ee50f2bf6b5819 # stash index
#...
$ git fsck --lost-found
#...
dangling commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
dangling commit f907f39d41763accf6d64f4c736642c0120d5ae2 # r
#...
Why does only the --lost-found version return the r commit? And why are the pre-rebase c and d not shown among the unreachable commits? I thought I understood the difference after reading the linked questions, but I am clearly missing something. I still have the complete reflog, but I guess you do not need it, since all commits (except those related to the stash) are referenced.
I know I should create another post, but the second question is partially related. Out of curiosity, I tried:
$ git fsck --lost-found --unreachable
#...
unreachable commit 6387b70fe14f1ecb90e650faba5270128694613d # stash
unreachable commit d8bb677ce0f6602f4ccad46123ee50f2bf6b5819 # stash index
unreachable commit f907f39d41763accf6d64f4c736642c0120d5ae2 # r
unreachable commit 7b7bac0608936a0bcc29267f68091de3466de1cf # d before rebase
unreachable commit cab074f2c9d63919c3fa59a2dd63ec874b0f0891 # c before rebase
#...
Combining both options, I get all the unreachable commits (and not just the union of --lost-found and --unreachable). This is very unexpected. Why does it behave like this?
Some of this is indeed puzzling, and appears not to be properly documented, but a quick look at builtin/fsck.c shows that using --lost-found:
1. implies --full;
2. implies --no-reflogs.
Item 1 isn't particularly interesting since --full is now on by default anyway, but the documentation really should call out that --lost-found disables --no-full. Item 2 explains most of the rest; I have a guess at the last part [Edit: the rest].
Note that when you ran:
git checkout master && git rebase --autostash test
this made Git run git stash push, which made a new stash consisting of two new commits. Git then did the rebase as usual, which copied the cab074f and 7b7bac0 commits, visible in the original git log --all --decorate --oneline --graph output, to the new 2e1cb7d and caee68c commits visible in the second output.
Why does only the --lost-found version return the r commit? And why are the pre-rebase c and d not shown among the unreachable commits?
Presumably that commit is still in the HEAD reflog. That makes it reachable from a reference, but since --lost-found implies --no-reflogs, it becomes unreachable this time. The same goes for the originals of c and d: they're reachable via multiple reflog entries, from both HEAD's reflog and master's.
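To check this on a scratch repository, here is a minimal sketch. The temporary directory, the identity, and the commit messages are made up for illustration, and it recreates only the r situation (a commit made on a detached HEAD and then left behind), not your whole history.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely illustrative
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

git commit -q --allow-empty -m a
branch=$(git symbolic-ref --short HEAD)

# Make a commit on a detached HEAD, then leave it behind.
git checkout -q --detach
git commit -q --allow-empty -m r
r=$(git rev-parse HEAD)
git checkout -q "$branch"

# Count how many output lines mention r in each mode.
plain=$(git fsck 2>/dev/null | grep -c "$r" || true)
no_reflogs=$(git fsck --no-reflogs 2>/dev/null | grep -c "$r" || true)
lost_found=$(git fsck --lost-found 2>/dev/null | grep -c "$r" || true)

echo "plain fsck mentions r:   $plain"
echo "--no-reflogs mentions r: $no_reflogs"
echo "--lost-found mentions r: $lost_found"
```

If the reading of builtin/fsck.c above is right, plain git fsck should stay silent about r (the HEAD reflog still holds it), while both --no-reflogs and --lost-found should report it as dangling.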
Combining both options, I get all the unreachable commits (and not just the union of --lost-found and --unreachable). This is very unexpected. Why does it behave like this?
That's more puzzling. [Edit: solved; see below.] Let's go through your git fsck commands in order:
fsck 1 and fsck 2: Both discover the autostash commits. That's because git stash push copied the original refs/stash to the stash reflog, so that refs/stash could point to the autostash w (working tree) commit. Then the implied git stash apply && git stash drop (git stash pop) applied the stash and dropped it, moving the stash@{1} entry back to refs/stash and deleting the stash reflog. So the w commit from the autostash is truly "dangling". It's not in refs/stash and it's not even in the stash reflog, because git stash (ab)uses this reflog as the "stash stack". It does, however, point to the i (index) commit from the autostash.
The first fsck, then, prints 6387b70fe14f1ecb90e650faba5270128694613d and calls it "dangling". That's the w commit that was dropped. The second fsck, with --unreachable, adds d8bb677ce0f6602f4ccad46123ee50f2bf6b5819: the corresponding i commit that was dropped. It is unreachable but not dangling, because w still points to it.
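The w/i relationship can be reproduced with an ordinary stash. This sketch assumes that a plain git stash push followed by git stash drop is a fair stand-in for what the autostash left behind; the file name and contents are invented for the example.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely illustrative
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo one > file.txt
git add file.txt
git commit -q -m a

echo "two more" > file.txt   # unstaged change, so there is something to stash
git stash push -q

w=$(git rev-parse stash)     # the working-tree commit
base=$(git rev-parse stash^1) # first parent: the commit HEAD was on
i=$(git rev-parse stash^2)   # second parent: the index commit

git stash drop -q            # w and i are now unreferenced

# Default mode reports only objects nothing else points to.
dangling=$(git fsck 2>/dev/null | grep '^dangling commit' | awk '{print $3}')
# --unreachable reports every unreferenced commit, including i.
unreachable=$(git fsck --unreachable 2>/dev/null | grep '^unreachable commit' | awk '{print $3}')

echo "w = $w"
echo "i = $i"
echo "dangling:    $dangling"
echo "unreachable: $unreachable"
```

Here the dangling list should contain only w, while the unreachable list should contain both w and i. Note that the first parent of w is the commit that was checked out when the stash was made, which matters for fsck 3 below.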
fsck 3: The r commit and the pre-rebase originals of c and d remained invisible under git fsck --unreachable because they're referenced from the reflogs. But now, with --lost-found, fsck does not look at the reflogs. We should expect to see the autostash w commit, the r commit, and the pre-rebase d, all as dangling. [Edit: as per comment, this is wrong: w links back to i and d, so this will hide d.]
We actually see the w and r commits but not the d commit. Why not? This is my guess, but it's easy to test if you still have the setup around: when you use git rebase successfully, Git creates or updates the pseudo-ref named ORIG_HEAD to remember the hash ID of the tip commit from before the rebase. Note that this same name is used to remember the previous value of a ref after a successful git reset that moves one, and after any other operation that might move a branch name some distance (fast-forward merge, for instance).
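The ORIG_HEAD part is easy to confirm on its own. A minimal sketch, assuming a scratch repository with a test branch and made-up file names (none of this is your actual setup):

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely illustrative
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo a > a.txt
git add a.txt
git commit -q -m a
main=$(git symbolic-ref --short HEAD)

# One commit on a side branch ...
git checkout -q -b test
echo e > e.txt
git add e.txt
git commit -q -m e

# ... and one on the original branch, so the rebase has something to copy.
git checkout -q "$main"
echo d > d.txt
git add d.txt
git commit -q -m d
before=$(git rev-parse HEAD)

git rebase -q test
after=$(git rev-parse HEAD)
orig=$(git rev-parse ORIG_HEAD)

echo "tip before rebase: $before"
echo "tip after rebase:  $after"
echo "ORIG_HEAD:         $orig"
```

ORIG_HEAD should print the same hash as the pre-rebase tip, and the branch itself should now point to a new copy. Whether fsck treats ORIG_HEAD as a starting point is a separate question that this sketch does not answer.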
If this guess is right, git fsck must consider all of the various *_HEAD pseudo-refs as starting points for reachability. This, too, is not documented (and it's not even completely clear it's intentional here: the ref code has been under some fairly heavy rework lately, to support alternative ref backends).
fsck 4, just before your SECOND QUESTION section: [Edit] Since --unreachable lists all unreachable commits, the fact that d is reachable indirectly from the autostash w commit is irrelevant, and we see everything. (My original guess here was that either --unreachable turned off the pseudoref inclusion, or, more likely, you did something in between that touched ORIG_HEAD so that it no longer selected the original, pre-rebase d commit. Neither is needed to explain the output.)
If you would like to report a Git documentation bug, namely that the fsck documentation does not note that --lost-found implies --no-reflogs, you should do that.