gitlab, bfg-repo-cleaner, git-filter-repo

Delete objects from a GitLab repository within a certain date range


I have to remove a number of files from a git repo on gitlab.com that are tracked with Git LFS. They were automatically updated in a nightly CI build over a number of weeks and I quickly ran out of storage space.

I followed the original documentation on cleaning up a GitLab repo, but the used storage space did not go down. I probably made a mistake while trying to delete the files only within a certain date range rather than across the entire history.


Solution

  • The following steps, using the BFG Repo-Cleaner instead of git-filter-repo, ended up working:

    (Note: git rev-list is your friend! This was the main thing that helped me filter out exactly which objects to delete from history. It's worth looking at its options!)

    1. Follow the original docs until step 6.
    2. Download the BFG Repo-Cleaner .jar file from here
    3. Find the object IDs for the file path to delete in the relevant parts of the history. The easiest way for me was to filter by date (replace <YYYY-MM-DD> with the start and end dates between which to scan the git repository, and <file path in repo> with the path of the file):
      git rev-list --all --objects --since <YYYY-MM-DD> --before <YYYY-MM-DD> | grep <file path in repo> > object_ids_to_delete.txt
      
    4. The file will contain lines of the form <Object ID> <file path>. Remove the <file path> part from every line so that only the object IDs remain, one per line.
    5. Run BFG on the repo with that list of object IDs: java -jar bfg.jar -bi ./object_ids_to_delete.txt ./project.git
    6. Go to the repo folder: cd ./project.git
    7. Expire the reflog and garbage-collect the now-unreferenced objects: git reflog expire --expire=now --all && git gc --prune=now --aggressive (see bfg docs)
    8. Continue with the remaining steps from the GitLab docs:
      1. Wait 30 minutes.
      2. Locate the commit map file that BFG generates: it should be in ./project.git.bfg-report/<date>/<timestamp>/object-id-map.old-new.txt
      3. Upload it in the "Repository Cleanup" part of the project settings, as described in the next step in the GitLab docs
      4. Wait some more...
      5. Success!
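
The grep and the manual path-stripping step could probably be combined into one small filter. Below is a minimal sketch, not something from the GitLab or BFG docs: extract_object_ids is a hypothetical helper name, and it assumes each line of the git rev-list --objects output has the form <Object ID> <file path>. Unlike a plain grep, it only keeps lines whose path matches exactly, so similarly named files are not picked up by accident.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or BFG): read "<object ID> <path>"
# lines on stdin and print only the object IDs whose path equals $1 exactly.
extract_object_ids() {
    # The path is passed through the environment so that awk treats it as a
    # literal string rather than a regular expression.
    TARGET_PATH="$1" awk '{
        id = $1                      # first field is the object ID
        rest = $0
        sub(/^[^ ]+ /, "", rest)     # drop "<object ID> " to leave the path
        # Lines without a path (e.g. commits) are left unchanged by sub()
        # and are skipped by the second condition.
        if (rest == ENVIRON["TARGET_PATH"] && rest != $0) print id
    }'
}

# Usage sketch, run inside the mirror clone (placeholders as in the steps):
#   git rev-list --all --objects --since <YYYY-MM-DD> --before <YYYY-MM-DD> \
#       | extract_object_ids "<file path in repo>" > object_ids_to_delete.txt
```

The resulting object_ids_to_delete.txt then contains one object ID per line and can be passed to bfg with -bi as in the steps above.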