giteol

Why does git treat some cpp files as binary?


Here's the output of git log:

* 5a831fdb34f05edd62321d1193a96b8f96486d69      HEAD (HEAD, origin/work, work)
|  LIB/xxx.cpp                        |  Bin 592994 -> 593572 bytes
|  LIB/xxx.h                          |    5 +++++
|  LIB/bbb/xxx.h                      |    9 +++++++++
|  LIB/aaa/xxx.cpp                    |  Bin 321534 -> 321536 bytes
|  LIB/aaa/yyy.cpp                    |   31 +++++++------------------------
|  tests/aaa/xxx.cpp                  |   29 +++++++++++++++++++++++++++++
|  tests/test_xxx.vcproj              |    4 ++++
|  7 files changed, 54 insertions(+), 24 deletions(-)

Why is it treating some files as binary and others not? This causes serious problems, since git also refuses to merge them automatically. Hence pretty much all merge/rebase/pull actions become a pain.

Here's the repo config:

[core]
  repositoryformatversion = 0
  filemode = false
  bare = false
  logallrefupdates = true
  symlinks = false
  ignorecase = true
  hideDotFiles = dotGitOnly
[remote "origin"]
  fetch = +refs/heads/*:refs/remotes/origin/*
  url = https://xxx/project.git
[branch "master"]
  remote = origin
  merge = refs/heads/master
[branch "work"]
  remote = origin
  merge = refs/heads/work
[svn-remote "svn"]
  url = xxxx
  fetch = :refs/remotes/git-svn

Also, core.autocrlf = false in the main .gitconfig.

Edit: I set core.autocrlf to true as suggested in the comments, but this doesn't seem to affect the next merge I'm attempting (maybe it's too late now to change autocrlf, or is it unrelated to the problem?):

> git merge work
warning: Cannot merge binary files: LIB/xxx.cpp (HEAD vs. work)

warning: Cannot merge binary files: LIB/aaa/xxx.cpp (HEAD vs. work)

Auto-merging LIB/xxx.cpp
CONFLICT (content): Merge conflict in LIB/xxx.cpp
Auto-merging LIB/xxx.h
Auto-merging LIB/aaa/xxx.cpp
CONFLICT (content): Merge conflict in LIB/aaa/xxx.cpp
Automatic merge failed; fix conflicts and then commit the result.

Also, git now insists on changing line endings in a couple of files (which is not what I want).


Solution

  • Try adding the following line to your $repo/.git/info/attributes:

    *.cpp crlf diff
    

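    You can specify attributes per repo (a .gitattributes file in the working tree, or .git/info/attributes), per user (the file named by core.attributesFile) and per system. Note that crlf is the older name of the text attribute; git still accepts it for backwards compatibility.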
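If you'd rather script the edit above than do it by hand, here is a minimal Python sketch. The helper name `add_attribute_rule` is made up for illustration, and it assumes `.git` is a plain directory inside the repo (not a worktree or submodule gitfile).

```python
from pathlib import Path


def add_attribute_rule(repo_dir, rule="*.cpp crlf diff"):
    """Append `rule` to <repo_dir>/.git/info/attributes unless it is already there.

    Hypothetical helper, not part of git. Returns True if the file was changed.
    """
    attributes = Path(repo_dir) / ".git" / "info" / "attributes"
    attributes.parent.mkdir(parents=True, exist_ok=True)

    current = attributes.read_text(encoding="utf-8") if attributes.exists() else ""
    if rule in (line.strip() for line in current.splitlines()):
        return False  # rule already present, nothing to do

    # Make sure the new rule starts on its own line.
    separator = "" if current == "" or current.endswith("\n") else "\n"
    with attributes.open("a", encoding="utf-8", newline="\n") as handle:
        handle.write(f"{separator}{rule}\n")
    return True
```

Afterwards, `git check-attr -a -- LIB/xxx.cpp` should list the attributes git now applies to that path.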


    Basic check-list

    • Does your file actually have CRLF line endings rather than LF?
    👉 If yes, set core.autocrlf to true (at least for this repo).


    • Does the file contain funny non-ASCII characters: umlauts, diacritics, emoji, kanji, copyright sigil ©, invisible esoteric spaces, etc.?
    👉 If yes, better ensure that all the stuff is encoded in UTF-8. Fussing with surrogate pairs isn't fun.


    • Does the file content start with UTF-8 BOM?
    👉 Wipe it now, it makes no sense.


    • Does the file content start with a UTF-16 BOM?
    👉 Too bad. UTF-16 text is full of NUL bytes, and NUL bytes are what make git classify a file as binary. I've got no good advice for you at this point; sorry. Contact your system vendor.
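To run through the check-list above without opening a hex editor, something like the following Python sketch may help. `inspect_bytes` and `GIT_SNIFF_BYTES` are names invented here. The 8000-byte window mirrors what git's diff heuristic does as far as I know (it looks for a NUL byte near the start of the file), so treat the result as an approximation of git's verdict, not the real thing.

```python
import codecs

# Assumption: git's "is this binary?" check looks for NUL in about the first 8000 bytes.
GIT_SNIFF_BYTES = 8000


def inspect_bytes(data: bytes) -> dict:
    """Report BOM, NUL bytes, line endings and non-ASCII content of raw file bytes."""
    if data.startswith(codecs.BOM_UTF8):
        bom, bom_len = "utf-8", len(codecs.BOM_UTF8)
    elif data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # A UTF-32 LE BOM also begins with these bytes; it is reported as utf-16 here.
        bom, bom_len = "utf-16", 2
    else:
        bom, bom_len = None, 0

    body = data[bom_len:]
    # Line-ending counts are only meaningful for 8-bit encodings (ASCII, UTF-8, ...).
    crlf = body.count(b"\r\n")
    lf_only = body.count(b"\n") - crlf

    return {
        "bom": bom,
        "nul_in_head": b"\0" in data[:GIT_SNIFF_BYTES],
        "crlf": crlf,
        "lf_only": lf_only,
        "non_ascii": any(byte > 0x7F for byte in body),
    }
```

Feed it the raw contents of a suspicious file, e.g. `inspect_bytes(open("LIB/xxx.cpp", "rb").read())`. If `nul_in_head` is true or `bom` says utf-16, that is most likely why the file shows up as `Bin` in the log.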