We have hundreds of repositories and regularly receive patches from upstream. A job checks each patch with git apply --check <patch>. If there is no error, the patch is applied with git apply <patch> and the changes are committed. If there is an error, the patch is labeled as a conflict. The errors and conflicting patches are then delivered to our repository maintainers, who use git apply --reject <patch> to apply the patches and resolve the conflicts.
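The job described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names run_git and triage_patch are hypothetical, the command runner is injectable so the logic can be exercised without a real repository, and committing the result is left to the caller.

```python
import subprocess
from typing import Callable, Sequence


def run_git(args: Sequence[str], repo_dir: str) -> int:
    """Run a git command inside repo_dir and return its exit status."""
    return subprocess.run(["git", *args], cwd=repo_dir, capture_output=True).returncode


def triage_patch(patch_path: str, repo_dir: str,
                 run: Callable[[Sequence[str], str], int] = run_git) -> str:
    """Return "applied" if the patch applies cleanly, otherwise "conflict"."""
    # Dry run: --check reports whether the patch would apply, without touching files.
    if run(["apply", "--check", patch_path], repo_dir) != 0:
        return "conflict"
    # The dry run passed, so apply for real; treat any late failure as a conflict too.
    if run(["apply", patch_path], repo_dir) != 0:
        return "conflict"
    # Committing the result is intentionally left to the caller.
    return "applied"
```

Patches that come back as "conflict" are the ones handed to the maintainers.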
As I previously understood it, git apply --reject was reliable. However, one maintainer reports that a patch was applied in a completely wrong way. Some new lines were inserted into a chunk in an unexpected function that happens to have the same context, and some other chunks were wrong as well.
For example, the chunk in the patch is:
@@ -1757,9 +1757,9 @@ def FunctionAAA()
print('hi')
}
+ print('hello world')
print('good day')
return True
But in the applied file, the chunk is:
@@ -1927,9 +1997,9 @@ def FunctionBBB() ---> in another function
print('hi')
}
+ print('hello world')
print('good day')
return True
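A toy model shows how this kind of misplacement can happen. The sketch below is not Git's actual algorithm; it assumes only that a chunk is placed wherever its pre-image lines (the context lines plus any removed lines) occur verbatim, and that the function name after the @@ markers is informational. The file contents and the name find_context_matches are hypothetical.

```python
from typing import List


def find_context_matches(file_lines: List[str], preimage: List[str]) -> List[int]:
    """Return every 0-based index where the pre-image lines occur verbatim."""
    n = len(preimage)
    return [i for i in range(len(file_lines) - n + 1)
            if file_lines[i:i + n] == preimage]


# Pre-image of the chunk: its context lines, without the added '+' line.
preimage = [
    "    print('hi')",
    "    }",
    "    print('good day')",
    "    return True",
]

# Hypothetical local file: FunctionAAA has drifted from upstream,
# while FunctionBBB still contains exactly the same lines.
local_file = [
    "def FunctionAAA()",
    "    print('hi there')",  # local change, so the context no longer matches here
    "    }",
    "    print('good day')",
    "    return True",
    "def FunctionBBB()",
    "    print('hi')",
    "    }",
    "    print('good day')",
    "    return True",
]

matches = find_context_matches(local_file, preimage)
print(matches)  # → [6], i.e. the only match is inside FunctionBBB
```

With a single match, a purely context-based tool has no reason to report a conflict, even though the match is in the wrong function.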
It is very likely that the maintainer would not notice the misplaced lines, which would result in build errors or, even worse, hidden bugs. I had the maintainer try git apply --3way <patch>, and the patch was applied as expected, although there were still conflicts.
I think git apply --reject and git apply --3way behave differently because they use different algorithms. Judging from the result, I guess we need to adopt git apply --3way. But I'm also worried that --3way could behave unexpectedly in some cases.
Why does git apply --reject apply the chunk in a seemingly wrong place instead of treating it as conflicted? Which is better in our case? Is there a better way to apply patches? Thanks.
git version 2.31.1
ubuntu 4.15.0-76-generic
TL;DR: you do indeed want --3way if possible.
There's some history here. The git apply command was originally at least partly a clone, more or less, of Larry Wall's historical patch command. This patch command always operates in --reject mode (see both the POSIX and the non-POSIX documentation). When running in this mode, it never does a three-way merge.
On the other hand, patches have defects: the fuzz factor applied to context matches allows inserting the indicated changes even if the context doesn't actually match. (Git's apply does not have fuzz.) The context-matching can go wrong, as it apparently did in your case, finding a similar-looking function, but not the correct function. A three-way merge avoids these problems by having three inputs:
the merge base version of the file;
the "ours" version of the file, that is, the copy currently in your repository; and
the "theirs" version of the file.
Git can construct two of these versions, the base and "theirs", using the index line in a Git patch, which contains the blob hash ID of the base version of the file. Git simply uses the hash ID to find the correct blob object in the repository. If that object exists, that is the file the patch author had as the "before" copy in their diff, so Git can extract that object, apply the patch exactly as it appears, and produce the "theirs" version of the file. Git can now do a normal three-way merge of the three files.
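To see which base versions a patch records, the index lines can be pulled out of the patch text. This is a minimal sketch, assuming ordinary git diff output in which each file section starts with a "diff --git" header, optionally followed by a line of the form "index <old>..<new> <mode>". The function name base_blob_ids and the sample hashes in the usage note are made up for illustration.

```python
import re
from typing import List, Optional

DIFF_HEADER = re.compile(r"^diff --git ")
# "index <old>..<new>" optionally followed by a file mode such as 100644.
INDEX_LINE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: [0-7]+)?$")


def base_blob_ids(patch_text: str) -> List[Optional[str]]:
    """Return one entry per file section: the base blob ID, or None if absent."""
    ids: List[Optional[str]] = []
    for line in patch_text.splitlines():
        if DIFF_HEADER.match(line):
            # A new file section starts; no index line has been seen for it yet.
            ids.append(None)
        elif ids and ids[-1] is None:
            found = INDEX_LINE.match(line)
            if found:
                ids[-1] = found.group(1)  # the "before" blob, i.e. the merge base
    return ids
```

For a patch whose first file section carries "index 1a2b3c4..5d6e7f8 100644" and whose second section has no index line, this returns ["1a2b3c4", None].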
The --3way option fails in two cases:
If there is no index line giving the merge base version, there is no way for Git to know which copy of the file was the "before" version in the context diff.
If there is a valid index line but you do not have the object in your repository, Git cannot construct the base and theirs copies of the file.
In these cases, the only available option is the fallback: try to find the right context (and hope a lot, using --reject if needed).
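The two failure cases can be checked before choosing a mode. The sketch below is one possible pre-check, not something Git provides: choose_apply_mode and blob_exists are hypothetical names, and the input is assumed to be the base blob IDs taken from the patch's index lines, with None for any file section that has no index line. It relies on git cat-file -e, which exits with status zero when the object exists.

```python
import subprocess
from typing import Callable, Iterable, Optional


def blob_exists(blob_id: str, repo_dir: str = ".") -> bool:
    """True if `git cat-file -e` finds the object in the repository."""
    result = subprocess.run(["git", "cat-file", "-e", blob_id],
                            cwd=repo_dir, capture_output=True)
    return result.returncode == 0


def choose_apply_mode(base_ids: Iterable[Optional[str]],
                      exists: Callable[[str], bool] = blob_exists) -> str:
    """Pick "--3way" only when every file section has a base blob we can find."""
    ids = list(base_ids)
    if not ids:
        return "--reject"  # nothing recorded, so no base to merge from
    for blob_id in ids:
        # Case 1: no index line. Case 2: the blob is not in this repository.
        # Note: files created by the patch have an all-zero base ID; this
        # sketch does not special-case them and treats them as missing.
        if blob_id is None or not exists(blob_id):
            return "--reject"
    return "--3way"
```

A result of "--reject" means the maintainer is back to context matching and should review every applied chunk by hand.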