Multiple developers have contributed to a project over a few years. A few commits ago, one commit changed the line endings in a single file from our standard LF (0x0A) to CR (0x0D). Reason unknown.
Running
git log -p -- filename
shows that every recent commit removes one line and adds one line. That line is the whole file content, with ^M symbols in place of line breaks.
Smart code editors like VSCode can still display the code correctly broken into lines, and GitLens "File history" correctly shows the diff introduced by each commit. But per-line blame is not available.
I can fix the line endings back to LF with VSCode (two clicks), but after that I will be blamed for every single line in the file.
It seems feasible to fix this by manually walking through each commit, recovering its diff, and somehow attributing the changed lines to the commit author: re-playing the changes, but with proper line endings.
How can I recover per-line authorship, taking some old commit as a starting point and applying the diffs from every commit made since the line endings were broken?
To do this locally,
mkdir -p .git/info
echo "* diff=crnl" >.git/info/attributes
git config diff.crnl.textconv "sed -Ez 's,\r\n?,\n,g'"
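which will cause any Git-controlled diff run in that particular repo, including the one behind git blame, to first translate classic Mac (CR) or Windows (CRLF) newlines to Unix (LF) newlines. The -z option is a GNU sed extension, so this assumes GNU sed.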
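To see what the filter does on its own, here is a minimal sketch that feeds it a mix of CR, CRLF and LF line endings. The function name normalize_eol is made up for illustration; the sed expression is the one from the config above and assumes GNU sed. Git appends the path of a temporary file when it runs a textconv command, which is why the function forwards its arguments.

```shell
# Hypothetical wrapper around the textconv command (GNU sed needed for -z).
normalize_eol() {
  # -z reads the whole input as one record, so \r and \n can be matched;
  # "$@" forwards a file name, the way Git passes one to a textconv command.
  sed -Ez 's,\r\n?,\n,g' "$@"
}

# Mixed endings in: CR after "one", CRLF after "two", LF after "three".
printf 'one\rtwo\r\nthree\n' | normalize_eol
```

Each of the three words should come out on its own LF-terminated line.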
The wildcard pattern might reach too far: if you've got binaries checked in, you'll want to come up with a narrower set of matching patterns.
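As a rough sketch of narrower patterns, the following uses a throwaway repository and two example extensions (*.txt and *.js are assumptions, pick whatever matches your text files), then asks git check-attr which diff driver a path would get. The file names notes.txt and logo.png are hypothetical and do not need to exist.

```shell
# Throwaway repo so the sketch does not touch a real one.
repo=$(mktemp -d) && cd "$repo" && git init -q

# Attach the crnl driver only to selected text extensions (examples only).
mkdir -p .git/info
printf '%s\n' '*.txt diff=crnl' '*.js diff=crnl' >.git/info/attributes

# Report the diff attribute for a text path and a binary path.
git check-attr diff -- notes.txt logo.png
```

The text file should be reported with the crnl driver and the image with no diff attribute set, so binaries are never piped through sed.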
edit: testcase:
sh -x <<\EOD
cd `mktemp -d`; git init
seq 5 >file; git add .; git commit -am-
sed 1s,^,x, -i file; git commit -am1x
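# go through a temp file: reading and truncating the same file in one pipeline is a race
tr \\n \\r <file >file.cr && mv file.cr file; git commit -am croops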
sed s,5,x5, -i file; git commit -am5x
git blame file
mkdir -p .git/info; echo '* diff=crnl' >.git/info/attributes
git config diff.crnl.textconv "sed -Ez 's,\r\n?,\n,g'"
git blame file
EOD