I have special filenames with escape \ characters stored in Git repository on Debian 10 Linux.
Problem: it is not possible to git checkout files on Windows, which have incompatible characters in the filename.
Example:
git log --all --name-only -m --pretty= '*\\*'
"systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"
I get following Git errors at Windows checkout:
C:\Git\bin\git.exe reset --hard "5ef1cac3a03304c35b455edf32bd1bb78060c5b9" --
error: invalid path 'systemd/system/default.target.wants/snap-git\x2dfilter\x2drepo-7.mount'
fatal: Could not reset index file to revision '5ef1cac3a03304c35b455edf32bd1bb78060c5b9'.
Done
Problem reproducing steps:
# Clone repository, to be executed on a safe repo:
git clone --no-local /source/repo/path/ /target/path/to/repo/clone/
# Cloning into '/target/path/to/repo/clone'...
# remote: Enumerating objects: 9534, done.
# remote: Counting objects: 100% (9534/9534), done.
# remote: Compressing objects: 100% (4776/4776), done.
# remote: Total 9534 (delta 4215), reused 8043 (delta 3136), pack-reused 0
# Receiving objects: 100% (9534/9534), 7.41 MiB | 16.78 MiB/s, done.
# Resolving deltas: 100% (4215/4215), done.
cd /target/path/to/repo/clone/
# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt
# Remove the files with escape \ from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.
# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"
# Unfortunately it seems filter-repo was executed, but log still lists filenames with escape \ :-(
Question:
1) How to remove all files from Git repo history with path having at least one escape \ character in filename?
(reason: it is not possible to checkout those files on Windows, which have incompatible characters in the filename)
UPDATE1:
Tried to replace \\x2d
string to - in input file list as suggested, but git history remove was still unsuccessful:
# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt
# Replace \\x2d string to - in git_repo_files_w_escape.txt:
sed -i 's/\\\\x2d/-/g' /opt/git_repo_files_w_escape.txt
# Remove the listed files from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.
# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"
# Unfortunately log still lists filenames with \\x2d :-(
UPDATE2:
Tried to replace \\x2d
in git_repo_files_w_escape.txt to \\\\x2d
or \x2d
but none of them resulted to remove the files having \\x2d
in filename from Git history.
UPDATE3:
I'm looking for a working solution based on git filter-repo.
Any more idea?
You fed bad input into filter-repo, based on a common but incorrect assumption about how git log works.
Look at your own output:
$ git log --format="reference" --name-status --diff-filter=A '*\\*'
"systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"
Let's look at the first line as an example. If you were to store that in a file, which you pass to --paths-from-file, then git-filter-repo is going to be looking for a file named "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
to remove. You have no such file in your repository. Instead you have one named systemd/system/default.target.wants/snap-git\x2dfilter\x2drepo-7.mount
. (Note that I have removed both "
characters and two of the \
characters.)
The problem here is that you assumed git log would list filenames as-is, which it won't do whenever there are special characters. You can often get around this by setting core.quotepath=false (this particularly helps when you have non-ascii characters), but even that is ignored when you have backslashes.
Here's something that might work better for you for generating the list of filenames to exclude:
git log -z --all --name-only -m --pretty= '*\\*' | tr '\0' '\n' | sort -u >/opt/git_repo_files_w_escape.txt
but it assumes you do not have filenames with newline characters. (If you do have files with newline characters, though, then --paths-from-file won't work for you.)
Even simpler would be bypassing creating a list of files with bad names and just programatically removing them by pattern:
git filter-repo --filename-callback 'return None if b'\\' in filename else filename'