In a large repo, I recently ran the following commands:
git rev-list --objects <my_rev_list> > RevListOut.txt
echo -e "<my_rev_list>" | git pack-objects --revs /tmp/XXX
git verify-pack -v /tmp/XXX-b569475c51d937df848abbcfe16433e2f8ebc0f5.pack > Unpack.txt
When I compared the objects in RevListOut.txt and Unpack.txt, I discovered that exactly two objects in Unpack.txt do not exist in RevListOut.txt. They have the following SHAs:
380a0876f57a4708b4a73a29d2ace2d4506880a2
2000687bd2701ff5c7c37013178d15384f0deefa
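For reference, here is a minimal sketch of how such a comparison could be scripted. The function name extra_in_pack is hypothetical, and the sketch assumes 40-character SHA-1 object IDs like the two above; it relies on both commands printing the object ID as the first field of each line, and it skips the summary lines that git verify-pack -v appends.

```shell
# Print object IDs that appear in `git verify-pack -v` output
# but not in `git rev-list --objects` output.
# Usage: extra_in_pack RevListOut.txt Unpack.txt
# (extra_in_pack is a hypothetical helper name, not a git command.)
extra_in_pack() {
    awk '
        # First file: rev-list output, "<sha> [<path>]" per line.
        FILENAME == ARGV[1] { listed[$1] = 1; next }
        # Second file: keep only lines whose first field is a 40-hex object ID,
        # which drops summary lines such as "non delta: ..." and "<pack>: ok".
        length($1) == 40 && $1 ~ /^[0-9a-f]+$/ && !($1 in listed) { print $1 }
    ' "$1" "$2"
}
```

Calling extra_in_pack RevListOut.txt Unpack.txt would then print one line per object that is in the pack but was not listed by rev-list.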
I investigated further and found that both of these objects correspond to files that are tracked by git.
Why might git rev-list --objects miss an object?
Checking the git config documentation led me to pack.useSparse:
When true, git will default to using the --sparse option in git
pack-objects when the --revs option is present. This algorithm only
walks trees that appear in paths that introduce new objects. This
can have significant performance benefits when computing a pack to
send a small change. However, it is possible that extra objects are
added to the pack-file if the included commits contain certain types
of direct renames. Default is true.
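Note the last two sentences there: with the sparse algorithm enabled (the default), the pack can contain extra objects that git rev-list --objects does not list.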
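One way to test whether the sparse algorithm accounts for the two extra objects is to rebuild the pack with it disabled and compare again. The sketch below uses the --no-sparse option of git pack-objects, which takes precedence over pack.useSparse; the wrapper name pack_no_sparse and the output base name in the usage note are hypothetical, and this is an experiment to try rather than a confirmed fix.

```shell
# Build a pack from the given revision arguments with the sparse
# tree-walking algorithm disabled.
# Usage: pack_no_sparse <base-name> <rev>...
# (pack_no_sparse is a hypothetical helper name.)
pack_no_sparse() {
    base=$1
    shift
    # --revs reads revision arguments from stdin, one per line.
    # --no-sparse overrides pack.useSparse for this invocation.
    printf '%s\n' "$@" | git pack-objects --revs --no-sparse "$base"
}
```

For example, pack_no_sparse /tmp/XXX-nosparse followed by the same revision arguments as before prints the hash of the new pack, which can then be passed through git verify-pack -v and compared against RevListOut.txt. A one-off alternative is to leave the command unchanged and run it as git -c pack.useSparse=false pack-objects --revs /tmp/XXX.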