git, git-gc

Do I ever need to run git gc on a bare repo?


man git-gc doesn't have an obvious answer in it, and I haven't had any luck with Google either (although I might have just been using the wrong search terms).

I understand that you should occasionally run git gc on a local repository to prune dangling objects and compress history, among other things -- but is a shared bare repository susceptible to these same issues?

If it matters, our workflow is multiple developers pulling from and pushing to a bare repository on a shared network drive. The "central" repository was created with git init --bare --shared.


Solution

  • As Jefromi commented on Dan's answer, git gc should be called automatically during "normal" use of a bare repository.

    I just ran git gc --aggressive on two bare, shared repositories that have been actively used; one with about 38 commits the past 3-4 weeks, and the other with about 488 commits over roughly 3 months. Nobody has manually run git gc on either repository.

    Smaller repository

    $ git count-objects
    333 objects, 595 kilobytes
    
    $ git count-objects -v
    count: 333
    size: 595
    in-pack: 0
    packs: 0
    size-pack: 0
    prune-packable: 0
    garbage: 0
    
    $ git gc --aggressive
    Counting objects: 325, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (323/323), done.
    Writing objects: 100% (325/325), done.
    Total 325 (delta 209), reused 0 (delta 0)
    Removing duplicate objects: 100% (256/256), done.
    
    $ git count-objects -v
    count: 8
    size: 6
    in-pack: 325
    packs: 1
    size-pack: 324
    prune-packable: 0
    garbage: 0
    
    $ git count-objects
    8 objects, 6 kilobytes
    

    Larger repository

    $ git count-objects
    4315 objects, 11483 kilobytes
    
    $ git count-objects -v
    count: 4315
    size: 11483
    in-pack: 9778
    packs: 20
    size-pack: 15726
    prune-packable: 1395
    garbage: 0
    
    $ git gc --aggressive
    Counting objects: 8548, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (8468/8468), done.
    Writing objects: 100% (8548/8548), done.
    Total 8548 (delta 7007), reused 0 (delta 0)
    Removing duplicate objects: 100% (256/256), done.
    
    $ git count-objects -v
    count: 0
    size: 0
    in-pack: 8548
    packs: 1
    size-pack: 8937
    prune-packable: 0
    garbage: 0
    
    $ git count-objects
    0 objects, 0 kilobytes
    

    I wish I had thought of it before running gc on these two repositories: I should have run git gc without the --aggressive option first to see the difference. Luckily I have a medium-sized active repository left to test (164 commits over nearly 2 months).

    $ git count-objects -v
    count: 1279
    size: 1574
    in-pack: 2078
    packs: 6
    size-pack: 2080
    prune-packable: 607
    garbage: 0
    
    $ git gc
    Counting objects: 1772, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (1073/1073), done.
    Writing objects: 100% (1772/1772), done.
    Total 1772 (delta 1210), reused 1050 (delta 669)
    Removing duplicate objects: 100% (256/256), done.
    
    $ git count-objects -v
    count: 0
    size: 0
    in-pack: 1772
    packs: 1
    size-pack: 1092
    prune-packable: 0
    garbage: 0
    
    $ git gc --aggressive
    Counting objects: 1772, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (1742/1742), done.
    Writing objects: 100% (1772/1772), done.
    Total 1772 (delta 1249), reused 0 (delta 0)
    
    $ git count-objects -v
    count: 0
    size: 0
    in-pack: 1772
    packs: 1
    size-pack: 1058
    prune-packable: 0
    garbage: 0
    

    Running git gc clearly made a large dent in count-objects, even though we regularly push to and fetch from this repository. But upon reading the manpage for git config, I noticed that the default loose object limit (gc.auto) is 6700, which we apparently had not yet reached.

    So it appears that the conclusion is no, you don't need to run git gc manually on a bare repo;* but with the default setting for gc.auto, it might be a long time before garbage collection occurs automatically.


    * Generally, you shouldn't need to run git gc. But if you are strapped for space, you can run git gc manually or set gc.auto to a lower value. My case for the question was simple curiosity, though.
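
For anyone who wants to check how far a bare repository is from the automatic threshold, or lower that threshold, here is a rough sketch. The function names and the repository path in the usage note are made up for illustration. It assumes the "count:" line of git count-objects -v is a fair stand-in for what git gc --auto looks at; git's own check is an estimate, so treat the comparison as a guide rather than an exact predictor.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of git.

# Number of loose objects: the "count:" field of `git count-objects -v`.
loose_count() {
    git --git-dir="$1" count-objects -v | awk '/^count:/ { print $2 }'
}

# Effective gc.auto value; falls back to the documented default of 6700
# when the setting is not configured anywhere.
gc_threshold() {
    git --git-dir="$1" config --get gc.auto || echo 6700
}

# Lower the threshold for this repository only (written to its own config),
# then let git decide whether a collection is due.
lower_threshold_and_gc() {
    git --git-dir="$1" config gc.auto "$2"
    git --git-dir="$1" gc --auto
}
```

Usage might look like `loose_count /path/to/central.git` followed by `lower_threshold_and_gc /path/to/central.git 1000`, where both the path and the value 1000 are placeholders. Setting gc.auto in the bare repository's own config keeps the change local to that repository instead of affecting every developer's global settings.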