We are currently facing a strange situation: a repository that takes only 65 MB as a local clone occupies 12 GB on the server (GitBlit, but that should not matter). I have tried several ideas to find out what is going wrong; here is the list:
I ran
git ls-tree -r -t -l --full-name HEAD > stats.txt
for each branch on the server and collected that information.
I then used
cut -c53-60 <filename> | grep -v '-' | awk '{ sum += $1 } END { print sum }'
to sum the file sizes of all commits. We did not find any commit with big files in it.
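The size summation above can also be sketched without fixed character columns. This is only a sketch: the function name sum_blob_sizes is made up for illustration, and it assumes the listing comes from git ls-tree with the -l option, where the fourth whitespace-separated field is the object size. A fixed range such as 53-60 can clip sizes that do not fit the padded size column, which field-based parsing avoids.

```shell
# Sum the blob sizes in a `git ls-tree -r -l` listing read from stdin.
# Field 4 is the object size; tree and submodule entries show "-" there
# and are skipped. printf "%.0f" avoids exponent notation for large sums.
sum_blob_sizes() {
    awk '$2 == "blob" && $4 != "-" { sum += $4 } END { printf "%.0f\n", sum }'
}

# Possible usage inside a repository (one total per local branch):
#   git for-each-ref --format='%(refname)' refs/heads |
#   while read -r ref; do
#       printf '%s ' "$ref"
#       git ls-tree -r -l --full-name "$ref" | sum_blob_sizes
#   done
```

Note that this only measures the uncompressed files in one snapshot per branch. Objects that exist in the object store but are not part of those snapshots do not show up in such a listing at all.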
My local directory .git/objects/pack contains a pack file that is currently 17 MB (after a GC; before it was 21 MB).
The pack files on the server currently total 12 GB.
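To compare the two sides with the same measure, the pack sizes can be totalled by a small helper. This is a sketch under assumptions: pack_bytes is a hypothetical name, and the directory argument is wherever the pack files live (for a bare repository on the server that is typically objects/pack directly under the repository directory).

```shell
# Print the total size in bytes of all *.pack files in a directory.
# Defaults to the pack directory of a non-bare clone.
pack_bytes() {
    dir=${1:-.git/objects/pack}
    total=0
    for f in "$dir"/*.pack; do
        [ -f "$f" ] || continue        # skip when the glob matched nothing
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Possible usage:
#   pack_bytes .git/objects/pack          # local clone
#   pack_bytes /path/to/projectID.git/objects/pack   # hypothetical server path
```

git count-objects -v reports similar numbers (size-pack, in KiB) together with the count of loose objects, which is useful when a GC is involved.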
I cloned the repository in the usual way with
git clone https://myserver.mycompancy.com/gitblit/r/projectID/projectID.git
and got a local copy. To be sure, I then ran git fetch --all, which changed nothing.
So what can we do to find the reason why the pack files on the server are much bigger? GitBlit has an automatic GC running that will pack loose objects older than 7 days.
Update: As recommended, I ran git verify-pack -v on both my local clone and the server; here are the results (statistics only):
So the output for the pack file on the server is about 270 times longer, which alone explains the difference in pack size. What should the next steps be to find the reason for so many more lines? Is any aspect of the statistics of particular interest?
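One possible next step is to break the verify-pack output down by object type instead of only counting lines. The following is a sketch, not a definitive tool: summarize_verify_pack is a hypothetical name, and it assumes the documented output layout of git verify-pack -v, where each object line reads "SHA-1 type size size-in-packfile offset" (plus depth and base for deltas).

```shell
# Summarize `git verify-pack -v` output read from stdin.
# Prints one line per object type: type, object count, bytes used in the pack.
# Trailer lines ("non delta: ...", "chain length = ...", "<pack>: ok") are ignored
# because their second field is not an object type.
summarize_verify_pack() {
    awk '$2 ~ /^(commit|tree|blob|tag)$/ {
             count[$2]++
             packed[$2] += $4
         }
         END {
             for (t in count) printf "%s %d %.0f\n", t, count[t], packed[t]
         }' | sort
}

# Possible usage:
#   git verify-pack -v .git/objects/pack/pack-*.idx | summarize_verify_pack
```

Running this on both sides shows whether the extra lines are mostly blobs, trees, or commits. To look at individual objects, git verify-pack -v ... | sort -k 3 -n | tail -10 lists the ten largest, and git rev-list --objects --all | grep <sha> maps an object ID back to a path if the object is still reachable.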
See my ticket on GitHub about the problem. Here is a summary of what we have done:
We inspected the pack files with git verify-pack -v (thanks to @max360).
After running git gc --prune --aggressive, the former 12 GB pack file shrank to ~110 MB.
We have no idea what went wrong to bloat the repository, but at least we found a way to shrink it again.
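The shrink step can be wrapped so that the effect is visible in numbers. This is a minimal sketch: repack_and_report is a made-up name, the repository path is whatever directory holds the repository, and it seems wise to try it on a backup copy first.

```shell
# Report object counts and pack size before and after an aggressive GC.
# $1 is the path to the repository (bare or non-bare).
repack_and_report() {
    repo=$1
    echo "before:"
    git -C "$repo" count-objects -v | grep -E '^(count|in-pack|size-pack):'
    # Same command as in the summary above, with --quiet to suppress progress.
    git -C "$repo" gc --prune --aggressive --quiet
    echo "after:"
    git -C "$repo" count-objects -v | grep -E '^(count|in-pack|size-pack):'
}

# Possible usage (hypothetical path):
#   repack_and_report /path/to/projectID.git
```

In the output, count is the number of loose objects, in-pack the number of packed objects, and size-pack the pack size in KiB, so a drop like the one from 12 GB to ~110 MB shows up directly in size-pack.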
@James Moger explained in the GitHub ticket that running a GC in GitBlit is an experimental feature and that, because GitBlit uses JGit instead of the Git binary, the result of a GC done by GitBlit may differ from that of the git gc command above.