git, git-filter-branch

What does refs/original/ do?


Using git for-each-ref --format="%(objectname:short) %(refname)" I get:

207e698 refs/heads/main
f78d212 refs/original/refs/heads/main
24f61e4 refs/original/refs/tags/Cấutrúc1
4248ddd refs/original/refs/tags/Cấutrúc1.1
207e698 refs/remotes/origin/main
bc5335c refs/tags/Cấutrúc2.0
a03c71f refs/tags/Cấutrúc2.0.1
c72d77c refs/tags/Cấutrúc2.1.0
03f6aa0 refs/tags/Cấutrúc2.2.0
391fcd5 refs/tags/cấutrúc2.3.0

What does refs/original/ do? An answer to "Remove refs/original/heads/master from git repo after filter-branch --tree-filter?" says that:

refs/original/* is there as a backup, in case you mess up your filter-branch. Believe me, it's a really good idea.

From the comment, I get that it is not just a backup for filter-branch, but a way to discourage you from using it. I suppose that by "really good idea" the quote above means that it's a really good idea not to use filter-branch. But is filter-branch the only operation that triggers its creation? I also see that it has various subpaths: refs/heads, refs/tags, heads. What are they trying to say?


Solution

  • Li'l sermon up front: there's a famous quote about apprehending the confusion of ideas that could produce such a question, and it's relevant here. If you'd explained what prompted this question, it'd have been much easier to figure out what's driving it. It's likely enough pure luck that something about this didn't smell to me like idle curiosity or some misbegotten hunt, so I did some digging. Really: when you're asking multiple questions prompted by the same situation (and that's not a bad thing at all), link them all and explain the situation.

    You did some history rewriting including a filter-branch, and then it looks like you did a rebase or just abandoned a branch. This sequence will produce the symptom that prompted your other question I linked:

    $ sh <<EOD
    cd `mktemp -d`; git init --template= -b example
    echo >file; git add .; git commit -mA
    git checkout -b deadfeature
    echo >>file; git commit -amZ
    git checkout -
    echo >>file; git commit -amB 
    sleep 1 # to give the C commit a clearly later timestamp:
    echo >>file; git commit -amC
    
    FILTER_BRANCH_SQUELCH_WARNING=1 \
            git filter-branch --env-filter 'export GIT_AUTHOR_NAME="Kilroy"' -- --all
    
    git branch -D deadfeature
    
    git log --graph --oneline --all
    EOD
    […setup chatter…]
    * 61ac5bc (HEAD -> example) C
    * a90a3b2 B
    * f62d81a A
    * fc6f2d4 C
    * 090789e B
    | * 7e8c2d0 Z
    |/
    * 79c4fde A
    $
    

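    and that log output is confusing: the rewritten history and the original one are drawn as a single unbroken column, with nothing to mark where one ends and the other begins. I'll go so far as to call it a misfeature of git log, one that has gone unfixed perhaps because it's so rarely encountered and so easily overcome, and because the easy fixes have their own problems.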

    $ git log --graph --oneline --all --boundary --min-parents=1 --decorate-refs=*
    * cf2fd07 (example) C
    * 62b4fb9 B
    | * 3b18524 (refs/original/refs/heads/example) C
    | * 0598ba4 B
    | | * 9c91f5c (refs/original/refs/heads/deadfeature) Z
    | |/
    | o e5d9cec A
    o a79a75e A
    $
    

    shows details git log ordinarily conceals. Adding decorations for administrative refs can easily get very annoying; here it's sensible, needed even, but that's far from always the case. And marking root commits with the "exclusion boundary marking" feature has some even worse interactions with other git log options. I can find a lot of sympathy for the view I imagine carried the day: best not to open that can of worms, and just show what's going on when people get confused.

    So refs/original/ is the ref prefix git filter-branch uses to remember what history looked like before it rewrote it. Whatever follows the prefix is simply the full name of the ref that was backed up, so refs/original/refs/heads/main holds the old refs/heads/main. If you were lazy and ran it in a repo where damage could be expensive to fix, git fetch -u . +refs/original/*:* will restore every ref it rewrote to what it was before the filter-branch. Your history's back where it came from; any work tree and index changes made since the filter-branch remain, and you can do whatever's right for you with those.
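    To watch the backup and the restore happen end to end, here is a minimal sketch in a throwaway repository. The repo layout, the user.name/user.email values and the variable names (before, rewritten, backup, restored) are made up for illustration, and quoting the refspec is my own precaution against shell globbing. It assumes a git install that still ships filter-branch.

```shell
set -e

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/example      # name the unborn branch
git config user.name  "Example"               # illustrative identity
git config user.email "example@example.invalid"

echo one  >file; git add file; git commit -q -m A
echo two >>file;               git commit -q -am B

before=$(git rev-parse refs/heads/example)

# Rewrite every commit's author; filter-branch saves the old tips
# under refs/original/<full original ref name>.
FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch --env-filter 'export GIT_AUTHOR_NAME="Kilroy"' \
    -- --all >/dev/null 2>&1

rewritten=$(git rev-parse refs/heads/example)
backup=$(git rev-parse refs/original/refs/heads/example)

# Show which ref each backup corresponds to.
git for-each-ref --format='%(objectname:short) %(refname)' refs/original

# Back out: force every backed-up ref onto the name it came from.
# -u allows the fetch to update the currently checked-out branch.
git fetch -q -u . '+refs/original/*:*'

restored=$(git rev-parse refs/heads/example)
echo "before=$before rewritten=$rewritten restored=$restored"
```

    After the fetch, the branch points at the pre-rewrite commit again, while the refs/original/ entries are still there until you remove them yourself.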

    Take some time to stare at what's going on here, and run the examples. Git does not remove history from the repository until it's gone unreferenced for quite a while; the factory default is two weeks to a month, depending on how exactly it was abandoned. Filter-branch can make massive changes. The refs/original namespace makes backing out easy in the short term, and possible until you explicitly decide you really won't be needing it. git log's default display simplifications hide things that would just get in the way, and displaying them with its existing facilities would be confusing or even error-prone. You hit the main downsides of its usual heuristics, lucky you.
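    Once you've decided you really won't be needing the backup, dropping it is a matter of deleting the refs under refs/original/. A minimal sketch, again in a throwaway repository with made-up names (backups_before and backups_after are just illustrative variables), and again assuming filter-branch is available:

```shell
set -e

# Throwaway repository with one rewritten commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/example
git config user.name  "Example"               # illustrative identity
git config user.email "example@example.invalid"
echo one >file; git add file; git commit -q -m A

FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch --env-filter 'export GIT_AUTHOR_NAME="Kilroy"' \
    -- --all >/dev/null 2>&1

backups_before=$(git for-each-ref --format='%(refname)' refs/original | wc -l | tr -d ' ')

# Delete every ref under refs/original/ in one update-ref transaction.
git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin

backups_after=$(git for-each-ref --format='%(refname)' refs/original | wc -l | tr -d ' ')
echo "backup refs: $backups_before before, $backups_after after"

# The old commits are now unreferenced apart from reflogs and will age out
# on git's normal schedule. To discard them immediately instead (optional,
# and not reversible), one could run:
#   git reflog expire --expire=now --all
#   git gc --prune=now
```

    Deleting the refs only removes the easy way back; the old objects themselves linger until the reflog entries expire and gc prunes them.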