I accidentally dropped a DVD-rip into a website project, then carelessly ran git commit -a -m ..., and, zap, the repository was bloated by 2.2 GB. Later I made some edits, deleted the video file, and committed everything, but the compressed file was still there in the repository's history.
I know I can start branches from those commits and rebase one branch onto another. But how do I merge the two commits so that the big file no longer shows up in the history and gets cleaned up by garbage collection?
Do not use:
git filter-branch
In my experience, this command may not shrink the remote repository after pushing: if you clone after using it, you may see that nothing has changed and the repository is still large. The command is also outdated; its own documentation warns about its pitfalls and recommends git filter-repo instead. For example, following the steps in https://github.com/18F/C2/issues/439 won't work.
The Solution
This solution is based on using:
git filter-repo
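git filter-repo is a separate tool that does not ship with Git, so it can help to check for it first. A minimal sketch, assuming the tool is installed as an executable named git-filter-repo on your PATH (have_filter_repo is a hypothetical helper name, and the pip command in the message is only one possible way to install it):

```shell
# Succeeds (exit 0) if an executable called git-filter-repo is found on PATH.
# Assumption: the tool is installed on PATH, not only in Git's exec path.
have_filter_repo() {
  command -v git-filter-repo >/dev/null 2>&1
}

if have_filter_repo; then
  echo "git filter-repo is available"
else
  echo "git filter-repo not found; one option: python3 -m pip install git-filter-repo"
fi
```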
(1) Find the largest files in .git (change the 10 in tail -10 to however many files you want to display):
git rev-list --objects --all | grep -f <(git verify-pack -v .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)
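If the one-liner above is hard to read, the same search can be sketched with git cat-file, which also prints each blob's size in bytes. This is an alternative sketch, not the command above; largest_blobs is a hypothetical helper name, and it assumes you run it inside the repository:

```shell
# List the N largest blobs reachable from any ref, largest last.
# Output: size in bytes, a tab, then the path recorded in history.
largest_blobs() {
  local count="${1:-10}"
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk '$1 == "blob" { size = $3; sub(/^[^ ]+ [^ ]+ [^ ]+ /, ""); print size "\t" $0 }' |
    sort -n -k1,1 |
    tail -n "$count"
}
```

Calling largest_blobs 10 lists the ten biggest files ever committed, including ones that were deleted in later commits.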
(2) Filter out these large files by passing the path and name of the file you would like to remove:
git filter-repo --path-glob '../../src/../..' --invert-paths --force
Or use the extension of the file, e.g., to filter all .zip files:
git filter-repo --path-glob '*.zip' --invert-paths --force
Or, e.g., to filter all .a library files:
git filter-repo --path-glob '*.a' --invert-paths --force
or whatever you find in step 1.
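After filtering, one way to confirm that nothing matching a pattern is left is to ask git log for any commit that touches it. A minimal sketch; path_in_history is a hypothetical helper name, and its argument is an ordinary Git pathspec:

```shell
# Succeeds (exit 0) if any commit on any ref touches a path matching the pathspec.
path_in_history() {
  [ -n "$(git log --all -n 1 --format=%H -- "$1")" ]
}
```

For example, path_in_history '*.zip' && echo "still in history" || echo "clean" should print "clean" once the .zip files have been filtered out.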
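(3) git filter-repo removes the origin remote as a safety measure, so add it again and force-push all branches and tags:
git remote add origin git@github.com:.../...git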
git push --all --force
git push --tags --force
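To see whether the cleanup actually reduced the size, the object store can be measured before and after, or in a fresh clone of the pushed repository. A minimal sketch built on git count-objects; repo_size_kib is a hypothetical helper name:

```shell
# Print the on-disk size of the object store in KiB (loose objects plus packs).
repo_size_kib() {
  git count-objects -v |
    awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
}
```

Running repo_size_kib in the old working copy and again in a new clone gives two numbers to compare.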