I've read Git documentation showing that I can explicitly mark certain files as text, so that their line endings are converted automatically, or as binary, so that they are left untouched.
However, I have also read that Git is pretty good at detecting binary files, which makes me think this is not needed. So my question is: do I really need to specify these explicit settings for every single file extension in my repository? I've seen some recommendations to do so for all image file extensions.
# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto
# Explicitly declare text files you want to always be normalized and converted
# to native line endings on checkout.
*.c text
*.h text
# Denote all files that are truly binary and should not be modified.
*.png binary
*.jpg binary
Thanks to all for the answers. I've written up a blog post: .gitattributes Best Practices.
Git will check the first 8,000 bytes of a file to see if it contains a NUL character. If it does, the file is assumed to be binary.
From git's source code:
#define FIRST_FEW_BYTES 8000

int buffer_is_binary(const char *ptr, unsigned long size)
{
    if (FIRST_FEW_BYTES < size)
        size = FIRST_FEW_BYTES;
    return !!memchr(ptr, 0, size);
}
Text files will be guessed correctly unless you intentionally insert a NUL character for some reason. For binaries, it's very likely that the first 8,000 bytes will contain at least one NUL.
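To make that concrete, here is a minimal sketch that runs the same NUL check on a few in-memory buffers. The sample buffers and the print_guess and run_demo helpers are hypothetical names made up for illustration; only buffer_is_binary mirrors the snippet quoted above, and Git may apply other checks in other code paths. The UTF-16 sample is there as an assumption-free reminder that text stored in that encoding contains zero bytes, so this check alone would call it binary.

```c
#include <stdio.h>
#include <string.h>

#define FIRST_FEW_BYTES 8000

/* Same check as the quoted snippet: a NUL byte anywhere in the
 * first FIRST_FEW_BYTES bytes marks the buffer as binary. */
int buffer_is_binary(const char *ptr, unsigned long size)
{
    if (FIRST_FEW_BYTES < size)
        size = FIRST_FEW_BYTES;
    return !!memchr(ptr, 0, size);
}

/* Hypothetical sample data, chosen only for illustration. */

/* Plain ASCII source text; measure it with strlen() so the string
 * terminator is not counted as file content. */
const char sample_text[] = "int main(void) { return 0; }\n";

/* First 16 bytes of a PNG file: the 8-byte signature, then the
 * IHDR chunk length (00 00 00 0D) and the chunk type "IHDR". */
const char sample_png[] = {
    '\x89', 'P', 'N', 'G', '\r', '\n', '\x1a', '\n',
    0, 0, 0, '\r', 'I', 'H', 'D', 'R'
};

/* "hi\n" encoded as UTF-16LE: every ASCII character is followed
 * by a zero byte. */
const char sample_utf16[] = { 'h', 0, 'i', 0, '\n', 0 };

/* Print what the check decides for one buffer. */
void print_guess(const char *label, const char *buf, unsigned long size)
{
    printf("%-6s -> %s\n", label,
           buffer_is_binary(buf, size) ? "binary" : "text");
}

/* Run the check on all three samples. */
void run_demo(void)
{
    print_guess("text", sample_text, strlen(sample_text));
    print_guess("png", sample_png, sizeof(sample_png));
    print_guess("utf16", sample_utf16, sizeof(sample_utf16));
}
```

Calling run_demo() from a main function reports the C source sample as text and the PNG and UTF-16 samples as binary. A NUL that first appears after byte 8,000 would go unnoticed, since only the first FIRST_FEW_BYTES bytes are scanned.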
For the most part, you shouldn't need to declare a file's type explicitly (I don't think I ever have). Realistically, just declare a specific file if you run into an issue.