git github encoding format-patch

Why are characters like "ã" appearing encoded in UTF-8 in Git and how to fix it?


I'm using Git to version my code, but I noticed that the author names of some commits show up in a strange encoded form. For example, the author name "João" appears like this when I view the commit as a patch in the browser (by adding .patch to the commit URL):

From: =?UTF-8?q?Jo=C3=A3o?= <joop011122@gmail.com>

How can I fix this and make author names appear correctly, without this encoding, when making commits in Git?


Solution

  • Your commit is still encoded in UTF-8. What you are seeing comes from the patch format, which applies an extra encoding [1] to email headers. The patch format is geared towards sending patches via email, and email headers should not contain raw non-ASCII text. The commit author is put in the From header, so Git has to encode the value when it contains a non-ASCII character like “ã” (Latin small letter a with tilde).

    In summary, there is nothing wrong with the commit itself. The author name is still stored in UTF-8. GitHub (see tag) simply uses the patch format, with this header encoding, when it serves the commit as a patch. [2]

    Notes

    1. Q-encoding, described in RFC 2047
    2. Email header encoding can be turned off with git format-patch --no-encode-email-headers
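
To confirm that the header still carries the original name, you can decode it yourself. The following is a minimal sketch in Python, not something Git or GitHub runs. The helper name decode_rfc2047 is made up for illustration, and it relies only on decode_header from the standard library's email.header module.

```python
from email.header import decode_header


def decode_rfc2047(value: str) -> str:
    """Decode RFC 2047 encoded-words in a header value (hypothetical helper)."""
    pieces = []
    for part, charset in decode_header(value):
        if isinstance(part, bytes):
            # Encoded words carry their charset; plain segments come back with None.
            pieces.append(part.decode(charset or "ascii"))
        else:
            # decode_header returns str when the header has no encoded words at all.
            pieces.append(part)
    return "".join(pieces)


# The From header value from the question.
raw = "=?UTF-8?q?Jo=C3=A3o?= <joop011122@gmail.com>"
print(decode_rfc2047(raw))
```

The =C3=A3 in the header is just the two UTF-8 bytes of “ã” written in hexadecimal, so decoding should give back the name as João followed by the unchanged email address.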