
git-filter-repo multiple operations (regex filter, move files, keep other dir...)


I'm trying to filter a repository with git-filter-repo. I tried to describe all the operations I need in a paths file passed to --paths-from-file, as described in the documentation, but I'm stuck on the last step: getting everything done in a single, simple run.

Situation and expected result

My git working directory looks like this:

$ tree
.
├── 00_a_playbook.yml
├── 01_other_playbook.yml
├── 01B_screw_that_fixed_numbering.yml
├── 02_playbook_2_which_is_now_3.yml
├── [...some more badly numbered playbooks...]
├── ansible.cfg
├── doc
│   ├── [...a bunch of docs...]
├── group_vars
│   ├── [...a bunch of var files...]
├── inventory
│   ├── [...a bunch of inventories...]
├── README.md
├── requirements.yml
└── roles
    └── my_role
        ├── [...all my role content...]

What I want to do:

  1. move all numbered playbooks from the repository root into a playbooks directory, dropping their number prefix.
  2. keep only the roles directory and the newly created playbooks directory at the root of the repo.

So the expected tree is:

.
├── playbooks
│   ├── a_playbook.yml
│   ├── other_playbook.yml
│   ├── screw_that_fixed_numbering.yml
│   ├── playbook_2_which_is_now_3.yml
│   ├── [...rest of un-numbered playbooks...]
└── roles
    └── my_role
        ├── [...all my role content...]
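
The intended mapping can be previewed outside of git with Python's re module, since git-filter-repo is a Python tool and its regex: entries use Python regular expression syntax. This is only a minimal sketch: new_path is a hypothetical helper, and it works on str values whereas git-filter-repo itself handles paths as bytes.

```python
import re

# Same pattern as the rename directive used later in this post:
# two digits, one optional extra character (e.g. the "B" in "01B"),
# an underscore, then the part of the file name to keep.
PATTERN = re.compile(r"^\d{2}.?_(.*\.yml)$")


def new_path(path):
    """Return the path after renaming, or None if the pattern does not match."""
    match = PATTERN.match(path)
    return f"playbooks/{match.group(1)}" if match else None


# File names taken from the tree shown above.
for name in [
    "00_a_playbook.yml",
    "01B_screw_that_fixed_numbering.yml",
    "02_playbook_2_which_is_now_3.yml",
    "requirements.yml",
    "ansible.cfg",
]:
    print(name, "->", new_path(name))
```

Only the numbered playbooks match; requirements.yml is left alone because it does not start with two digits.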

What works

I actually succeeded but I need to run the tool twice to get the expected result.

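If I create the two following paths files:

$ cat ../paths_rename_playbooks.txt
regex:^\d{2}.?_(.*\.yml)$==>playbooks/\1

$ cat ../paths_keep_folders.txt
roles/my_role
playbooks

and then run: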

git-filter-repo --paths-from-file ../paths_rename_playbooks.txt
git-filter-repo --paths-from-file ../paths_keep_folders.txt

I get exactly what I expect and described above.

Grouping paths directives in a single file fails

When I try to do all the above operations from a single file, it fails.

# 2 above files merged in one
$ cat ../all_paths.txt
regex:^\d{2}.?_(.*\.yml)$==>playbooks/\1
roles/my_role
playbooks

# Single run with that file. No errors
$ git-filter-repo --paths-from-file ../all_paths.txt

$ tree
.
└── roles
    └── my_role
        ├── [... my role content ...]

As you can see, my renamed playbook files are gone. Did I miss something in the way I describe those operations, or do I have no other choice than running the tool several times to get what I want?


Solution

  • Self-answering, since laying out the problem helped me spot the issue: I had missed an important note in the documentation about path renaming.

    Note: if you combine path filtering with path renaming, be aware that a rename directive does not select paths, it only says how to rename paths that are selected with the filters.

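    So every path to keep must be selected by its name in the original repo, before any renaming is applied. In my single-file attempt, the playbooks entry matched nothing because that directory does not exist until the rename happens, so the numbered playbooks were never selected. Hence in my case the regex must be defined twice: once as a plain path filter and once as a rename. The following paths file does the job in a single run: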

    $ cat ../paths.txt
    # Paths to be kept
    regex:^\d{2}.?_.*\.yml$
    roles/my_role
    
    # Rename playbooks
    regex:^\d{2}.?_(.*\.yml)$==>playbooks/\1
    
    $ git-filter-repo --paths-from-file ../paths.txt
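
To make the "select first, then rename" behaviour concrete, here is a toy model in Python. It is a sketch based on the documentation note quoted above, not git-filter-repo's actual implementation: simulate and its parameters are hypothetical names, paths are plain str, and the role file used in the example is made up because the role content is elided in the question.

```python
import re

# The rename directive from the paths file: (pattern, replacement).
RENAME = (r"^\d{2}.?_(.*\.yml)$", r"playbooks/\1")


def simulate(paths, keep_paths, keep_regexes, renames):
    """Toy model: select on the ORIGINAL names first, then rename survivors."""
    has_filters = bool(keep_paths or keep_regexes)
    kept = []
    for path in paths:
        selected = (
            not has_filters  # no filters at all: every path is kept
            or any(path == p or path.startswith(p + "/") for p in keep_paths)
            or any(re.search(rx, path) for rx in keep_regexes)
        )
        if not selected:
            continue
        # Renames never select anything; they only rewrite selected paths.
        for pattern, replacement in renames:
            path = re.sub(pattern, replacement, path)
        kept.append(path)
    return kept


REPO = [
    "00_a_playbook.yml",
    "01B_screw_that_fixed_numbering.yml",
    "ansible.cfg",
    "roles/my_role/tasks/main.yml",  # hypothetical file, role content is elided
]

# Merged file from the question: "playbooks" matches no original path.
merged_attempt = simulate(REPO, ["roles/my_role", "playbooks"], [], [RENAME])

# Working file: the numbered playbooks are selected by their original names.
working_file = simulate(REPO, ["roles/my_role"], [r"^\d{2}.?_.*\.yml$"], [RENAME])

print(merged_attempt)
print(working_file)
```

In this model the merged file keeps only the role, while the working file keeps the role and the playbooks under their new names. It also shows why two separate runs worked: a run with only a rename directive has no filter, so every path is kept, and the playbooks directory already exists when the second run selects it.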