git git-rev-list

Why might `git rev-list --objects` miss two blobs?


In a large repo, I recently ran the following commands:

git rev-list --objects <my_rev_list> > RevListOut.txt

echo -e "<my_rev_list>" | git pack-objects --revs /tmp/XXX
git verify-pack -v /tmp/XXX-b569475c51d937df848abbcfe16433e2f8ebc0f5.pack > Unpack.txt

When I compared the objects listed in RevListOut.txt and Unpack.txt, I found exactly two objects in Unpack.txt that do not appear in RevListOut.txt. They have the following SHAs:

380a0876f57a4708b4a73a29d2ace2d4506880a2
2000687bd2701ff5c7c37013178d15384f0deefa
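
For reference, a comparison like this could be done roughly as sketched below. The helper name `objects_only_in_pack` is made up for illustration, and the sketch assumes a SHA-1 repository (40-character hex IDs) and the usual `git verify-pack -v` layout, where each object line begins with the object ID and summary lines do not.

```shell
# Hypothetical helper: print object IDs that appear in the verify-pack
# listing but not in the rev-list listing.
#   $1: output of `git rev-list --objects` (lines are "<sha>" or "<sha> <path>")
#   $2: output of `git verify-pack -v` (object lines start with a 40-hex ID;
#       summary lines such as "non delta: ..." do not and are filtered out)
objects_only_in_pack() {
    rev_ids=$(mktemp)
    pack_ids=$(mktemp)
    # Keep only the object ID column of each listing, sorted for comm.
    cut -d' ' -f1 "$1" | sort -u > "$rev_ids"
    grep -E '^[0-9a-f]{40} ' "$2" | cut -d' ' -f1 | sort -u > "$pack_ids"
    # comm -13 prints the lines that occur only in the second file.
    comm -13 "$rev_ids" "$pack_ids"
    rm -f "$rev_ids" "$pack_ids"
}
```

Calling `objects_only_in_pack RevListOut.txt Unpack.txt` would then print one ID per line for every object that is in the pack but was not reported by rev-list.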

I investigated further and found that both of these objects correspond to files that are tracked by Git.

Why might git rev-list --objects miss an object?


Solution

  • Checking the git config documentation led me to this entry:

       pack.useSparse
           When true, git will default to using the --sparse option in git
           pack-objects when the --revs option is present. This algorithm only
           walks trees that appear in paths that introduce new objects. This
           can have significant performance benefits when computing a pack to
           send a small change. However, it is possible that extra objects are
           added to the pack-file if the included commits contain certain types
           of direct renames. Default is true.
    

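    Note the last two sentences there: with the sparse algorithm enabled (the default), git pack-objects --revs can add extra objects to the pack, which would explain why the pack contains two objects that git rev-list --objects does not report.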
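
One way to check this explanation is to build the pack again with the sparse algorithm turned off and list its contents. The sketch below is a rough illustration rather than a definitive procedure: the helper name `pack_ids_no_sparse` is hypothetical, it assumes a SHA-1 repository, and it expects the revision arguments one per line in its first argument.

```shell
# Hypothetical helper: list the object IDs that `git pack-objects --revs`
# packs for the revisions in $1 when the sparse algorithm is disabled.
pack_ids_no_sparse() {
    out_dir=$(mktemp -d)
    # -c pack.useSparse=false overrides the config for this one command
    # (passing --no-sparse to pack-objects has the same effect).
    # pack-objects prints the hash used in the pack's file name on stdout.
    pack_hash=$(printf '%s\n' "$1" |
        git -c pack.useSparse=false pack-objects --revs -q "$out_dir/check")
    # Object lines in `verify-pack -v` start with the object ID.
    git verify-pack -v "$out_dir/check-$pack_hash.pack" |
        grep -E '^[0-9a-f]{40} ' | cut -d' ' -f1 | sort -u
    rm -rf "$out_dir"
}
```

If the sparse walk is the cause, the two extra IDs should no longer appear in this listing, and it should match the IDs reported by `git rev-list --objects` for the same revisions.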