git newline core.autocrlf lf

Do I really need to specify all binary files in .gitattributes?


I've read the Git documentation, which shows that I can explicitly mark certain files as text, so that their line endings are normalized automatically, or as binary, so that they are left untouched.

However, I have also read that Git is pretty good at detecting binary files, which makes me think this may not be needed. So my question is: do I really need to specify these explicit settings for every file extension in my repository? I've seen some recommendations to do so for all image file extensions.

# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto

# Explicitly declare text files you want to always be normalized and converted
# to native line endings on checkout.
*.c text
*.h text

# Denote all files that are truly binary and should not be modified.
*.png binary
*.jpg binary

Thanks to all for the answers, I've written up a blog post: .gitattributes Best Practices.


Solution

  • Git checks the first 8,000 bytes of a file for a NUL character. If it finds one, it assumes the file is binary.

    From git's source code:

    #define FIRST_FEW_BYTES 8000
    int buffer_is_binary(const char *ptr, unsigned long size)
    {
        if (FIRST_FEW_BYTES < size)
            size = FIRST_FEW_BYTES;
        return !!memchr(ptr, 0, size);
    }
    

    Text files will be guessed correctly unless you intentionally insert a NUL character for some reason. For binaries, it's more than likely that the first 8,000 bytes will contain at least one NUL byte.

    For the most part, you shouldn't need to declare a file's type explicitly (I don't think I ever have). Realistically, declare a specific file only if you run into an issue with it.
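
To see where the heuristic quoted above can and cannot be fooled, here is a minimal sketch that wraps the same check in a small test helper. `classify_sample` is a hypothetical name introduced only for this illustration and is not part of Git. The sketch covers only the check as quoted; Git may apply further logic elsewhere.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FIRST_FEW_BYTES 8000

/* Same check as the excerpt quoted above: look for a NUL byte
 * in at most the first FIRST_FEW_BYTES bytes of the buffer. */
static int buffer_is_binary(const char *ptr, unsigned long size)
{
    if (FIRST_FEW_BYTES < size)
        size = FIRST_FEW_BYTES;
    return !!memchr(ptr, 0, size);
}

/* Hypothetical helper (not part of Git): build a buffer of `size`
 * bytes filled with 'a', place a single NUL at index `nul_at`
 * (or no NUL at all if nul_at >= size), and return the verdict
 * of buffer_is_binary(): 1 for "binary", 0 for "text". */
static int classify_sample(unsigned long size, unsigned long nul_at)
{
    char *buf = malloc(size ? size : 1);
    int result;

    assert(buf != NULL);
    memset(buf, 'a', size);
    if (nul_at < size)
        buf[nul_at] = '\0';

    result = buffer_is_binary(buf, size);
    free(buf);
    return result;
}
```

With this helper, `classify_sample(10000, 7999)` returns 1 because the NUL falls inside the inspected window, while `classify_sample(10000, 8000)` returns 0 because the only NUL sits just past it. A binary file whose first 8,000 bytes happen to contain no NUL is the kind of rare case where an explicit `binary` entry in .gitattributes would be needed.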