Tags: git, git-filter-branch, bfg-repo-cleaner

Split Git Repository and only keep history of remaining files


I have a Git repository containing 11 different and independent projects (don't ask me why the **** they are all in one repository). Because some of the projects contain many assets, GitLab reports the size of the repo as about 14.3 GB, which causes huge checkout times (up to 20 minutes on our CI/CD system).

Because we only build one of the projects at a time, I want to split the projects into separate repositories. Because Project A does not need commits related to files of Project B, I also want to clean up the whole history.

I already tried different ways:

  1. Deleting the files. The files are gone, but still available via history.
  2. Using a simple git filter-branch --prune-empty, but I want to keep the file structure.
  3. Using git filter-branch --index-filter --prune-empty with git rm --cached --ignore-unmatch, but I can still recover old files.
  4. Deleting the files and using BFG with --delete-folders. Great result, but I can only provide a glob/regex, and some projects contain folders named after other projects (bad naming...), which get wiped out as well.

Ideally there would be a tool or command that works like BFG but lets me provide paths to delete or, better, paths to keep.

Example of the file structure:

./
+- Project A/
+- Project B/
+- UI Projects/
|  +- Foo/
|  +- Bar/
+- Project E/
|  +- Foo/
|     +- Bar/
+- Build/
   +- build_a/
   +- build_b/
   +- build_foo/
   +- build_bar/
   +- build_e/

My requirements are:

Any suggestions?


Solution

  • The following tree-filter satisfies your requirements:

    find . ./Build -maxdepth 1 -path . -o -path ./Build -o -path "./Project A" -o -path ./Build/build_a -o -exec rm -rf {} +
    

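    Replace Project A and build_a with the actual project and build folder names. To keep other nested paths, follow the example of the ./Build folder: list the parent directory as an additional start point, then add a -path test for the parent and for each entry inside it that should survive.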
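Before rewriting any history, it can help to preview what the expression would delete. The following is a minimal sketch, assuming a find that supports -maxdepth (GNU and BSD find both do): it rebuilds part of the question's example layout in a temporary directory and swaps -exec rm -rf {} + for -print, so nothing is actually removed. The temporary directory and the `doomed` variable are illustrative names only.

```shell
#!/bin/sh
# Recreate part of the example layout in a scratch directory (illustrative only).
demo=$(mktemp -d)
mkdir -p "$demo/Project A" "$demo/Project B" "$demo/UI Projects/Foo" \
         "$demo/Project E/Foo/Bar" \
         "$demo/Build/build_a" "$demo/Build/build_b" "$demo/Build/build_e"

# Same expression as the tree-filter, but printing instead of deleting.
# Paths matched by one of the -path tests are skipped; everything else
# at depth 1 under . and ./Build is listed.
doomed=$(cd "$demo" && find . ./Build -maxdepth 1 \
    -path . -o -path ./Build -o -path "./Project A" -o -path ./Build/build_a \
    -o -print | LC_ALL=C sort)

printf '%s\n' "$doomed"

# Remove the scratch directory again.
rm -rf "$demo"
```

Everything printed is what the real filter would pass to rm -rf; "./Project A", "./Build" and "./Build/build_a" should not appear in the list.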

    Pass it to the --tree-filter option of filter-branch:

    git filter-branch --tree-filter '...' --tag-name-filter cat --prune-empty -- --all
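For a feel of the whole round trip, here is a hedged end-to-end sketch on a throwaway repository rather than the real one. The file names, the identity "Demo", and the variables `repo`, `filter`, `remaining_files` and `commit_count` are all made up for illustration. The cleanup steps after the rewrite (deleting refs/original, expiring the reflog, running gc) follow the shrinking checklist in the git-filter-branch documentation; without them the old objects stay recoverable, which may be what happened in attempt 3 of the question.

```shell
#!/bin/sh
# Skip the warning and delay that recent Git versions add to filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository with two projects and two build folders (hypothetical names).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name  "Demo"
git config user.email "demo@example.invalid"

mkdir -p "Project A" "Project B" Build/build_a Build/build_b
echo a > "Project A/a.txt"
echo b > "Project B/b.txt"
echo x > Build/build_a/x
echo y > Build/build_b/y
git add -A
git commit -q -m "add projects"
echo b2 >> "Project B/b.txt"
git commit -q -am "change only Project B"   # becomes empty once Project B is removed

# The tree-filter from the answer, kept in a variable for readability.
filter='find . ./Build -maxdepth 1 -path . -o -path ./Build -o -path "./Project A" -o -path ./Build/build_a -o -exec rm -rf {} +'
git filter-branch --tree-filter "$filter" --tag-name-filter cat --prune-empty -- --all >/dev/null 2>&1

# Drop the backup refs and unreachable objects so the old files are really gone.
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

# Inspect the result: which files remain, and how many commits survived.
remaining_files=$(git ls-tree -r --name-only HEAD | LC_ALL=C sort)
commit_count=$(git rev-list --count HEAD)
printf '%s\n' "$remaining_files"
echo "commits: $commit_count"

cd / && rm -rf "$repo"
```

One caveat worth checking against real history: in commits where ./Build does not exist yet, find reports an error for the missing start point and returns a non-zero status, which filter-branch treats as a failed filter. Appending `|| true` to the filter is one way to tolerate that, at the cost of hiding other errors.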