
Git: When does Git perform garbage collection?


I was wondering: when does Git perform garbage collection? I know that in the past one had to invoke git gc manually to start it, but now it happens automatically. When exactly?

Also, is there a need to invoke it manually in the latest Git versions?


Solution

  • Much of the answer is in the git gc documentation:

    --auto

    With this option, git gc checks whether any housekeeping is required; if not, it exits without performing any work. Some git commands run git gc --auto after performing operations that could create many loose objects.

    Housekeeping is required if there are too many loose objects or too many packs in the repository. If the number of loose objects exceeds the value of the gc.auto configuration variable, then all loose objects are combined into a single pack using git repack -d -l. Setting the value of gc.auto to 0 disables automatic packing of loose objects.

    If the number of packs exceeds the value of gc.autopacklimit, then existing packs (except those marked with a .keep file) are consolidated into a single pack by using the -A option of git repack. Setting gc.autopacklimit to 0 disables automatic consolidation of packs.
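
    As a rough illustration of the two thresholds described above, here is a minimal sketch that compares the output of git count-objects -v with gc.auto and gc.autopacklimit. The gc_needed function name is made up for this example, and the fallbacks 6700 and 50 are the documented defaults. The result is only an approximation: git count-objects also counts packs marked with a .keep file, and Git's own check may estimate the number of loose objects rather than count them exactly.

    ```shell
    #!/bin/sh
    # Sketch: guess whether `git gc --auto` would find housekeeping to do.

    # gc_needed LOOSE PACKS GC_AUTO GC_AUTOPACKLIMIT  (hypothetical helper)
    # Prints "loose", "packs" or "none". A threshold of 0 disables that check.
    gc_needed() {
        if [ "$3" -gt 0 ] && [ "$1" -gt "$3" ]; then
            echo "loose"
        elif [ "$4" -gt 0 ] && [ "$2" -gt "$4" ]; then
            echo "packs"
        else
            echo "none"
        fi
    }

    # Only query Git when we are actually inside a repository.
    if git rev-parse --git-dir >/dev/null 2>&1; then
        # Fall back to the documented defaults when nothing is configured.
        auto=$(git config --get gc.auto || echo 6700)
        limit=$(git config --get gc.autopacklimit || echo 50)
        # `git count-objects -v` prints "count: N" (loose) and "packs: N".
        loose=$(git count-objects -v | awk '/^count:/ {print $2}')
        packs=$(git count-objects -v | awk '/^packs:/ {print $2}')
        echo "housekeeping trigger: $(gc_needed "$loose" "$packs" "$auto" "$limit")"
    fi
    ```

    Running git gc --auto afterwards shows what Git itself decides; if it exits without doing anything, no housekeeping was needed.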

    The only thing missing here is an explanation of which commands (the "some git commands" above) run git gc --auto, and when. This list is subject to change, but looking at the current Git source, these stand out:

    git fetch
    git merge
    git receive-pack
    git am
    git rebase
    

    (This list comes from running git grep -e --auto -- '*.c' '*.sh' in the Git source tree and excluding, by eye, all the t/ test scripts and other obvious false hits.) If you want something more in depth, the source to git is on github.com...
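
    To repeat that search with the t/ directory filtered out automatically, something like the following might work in a checkout of the Git source. The drop_test_hits helper is hypothetical, and the remaining hits still need a manual look for other false positives.

    ```shell
    #!/bin/sh
    # Hypothetical helper: drop paths under the t/ test directory.
    drop_test_hits() {
        grep -v '^t/'
    }

    # Only search when we are inside a repository (ideally a clone of git.git).
    if git rev-parse --git-dir >/dev/null 2>&1; then
        # -l lists matching files; -e is needed because the pattern starts with "-".
        git grep -l -e --auto -- '*.c' '*.sh' | drop_test_hits || true
    fi
    ```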

    Note: with Git 2.17 (Q2 2018), you also need to consider:

    git commit
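
    Since these commands trigger git gc --auto on their own, one way to control when housekeeping happens is to suppress it for a single invocation and run it explicitly later. A minimal sketch, assuming a hypothetical wrapper named git_no_autogc; it relies on git -c, which overrides a configuration variable for one command, and on gc.auto being set to 0, which the quoted documentation says disables automatic packing:

    ```shell
    #!/bin/sh
    # Hypothetical wrapper: run one Git command with gc.auto forced to 0,
    # so that command does not kick off automatic housekeeping.
    git_no_autogc() {
        git -c gc.auto=0 "$@"
    }

    # Possible usage (commented out so the sketch has no side effects):
    #   git_no_autogc fetch origin   # fetch without automatic gc
    #   git gc --auto                # later, let Git decide whether to repack
    ```

    To turn the automatic behaviour off for a whole repository instead, git config gc.auto 0 sets the same variable persistently; in that case git gc has to be run by hand.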