Can git cherry-pick be replicated *exactly* with (git diff | git apply)?


I have a quite subtle script that's trying to patch differences from a formatted version of a text file to a "legacy" unformatted version. To help avoid issues with extraneous diffs due to format changes, I arrange the git tree to look like:

C o   L' o --- restored (unformatted) legacy version
  |      |
  |      |
  `------o --- auto-formatted legacy version
       F |
         |
         o --- original legacy version
       L |

where:

and all the branches are clean.

Then I cherry-pick C into the L' branch with -Xtheirs to just accept the changes in C.

The 3-way diff works as hoped, and because F is the merge base, the format-related changes outside the manual changes are ignored (note that git merge is absolutely wrong in this case since it wants to recreate the format diffs).

This is almost 100% what I want, and works really well, but sometimes drags format-only changes into the legacy file if they were adjacent to the manual changes, and I'd like to see if I can do something to reduce that.

Since cherry-pick doesn't have all the options available compared to diff and apply, I was thinking about playing around and trying to fine-tune things. However I'm not 100% sure if cherry-pick really is, absolutely definitely, going to behave exactly like some form of diff | apply in this case.

The current line is:

git cherry-pick --strategy=ort -Xtheirs <C-branch>

while on the L' branch.

I tried a few versions of git diff | git apply and things seem to work, but since I want to carefully comment how this could differ from the default behaviour of cherry-pick, I first need to know what the exact equivalent cherry-pick command would be.

I see a lot of places saying "it's a bit like it", but I've not found a definitive statement of whether it's identical (and if it's identical, what flags are effectively being used by git cherry-pick).

Yes, I am obviously aware that for git diff | git apply to work, all the blobs must be in the index (in this case they clearly are) and that you can't replace diff | apply with a cherry-pick in the general case, but in the case I describe, are they equivalent?

The closest answer I found was: Is git cherry-pick actually the same as git show + git apply?

But it's still non-canonical since it says "functionally equivalent for normal cases".

Please only answer if you have a canonical response which is 100% trustworthy. I've already read dozens of non-canonical sites talking about this in vague terms, and I'm happy if someone can show they aren't identical and why (I can still document that).

Edit: As some people seem to be under the impression that I was going to use git diff | git apply without options, that's absolutely not the case. What I want to know is: is there a set of options which can make diff | apply work exactly like cherry-pick?

If there is, I can start there and tweak other options (e.g. -U, --no-indent-heuristic, --full-index, --inter-hunk-context etc.) to try and get better results, but if there is no such set of options, I might just stick with cherry-pick since it's mostly good enough most of the time.


Solution

  • What I want to know is, is there a set of options which can make diff | apply work exactly like cherry-pick [in this case]?

    For the single-file history you're showing, with branches I'll call C and L here, with L (the L' tip) checked out, git diff-tree -p C~ C file | git apply -3 --theirs is exactly equivalent to git cherry-pick -Xtheirs C (with or without --strategy=ort, which is now the default).

    To see this, consider that for a single tracked file both commands are quite simple: git apply -3 and git cherry-pick C both wind up invoking a low-level merge function on three versions of that file, and both pass the "theirs" strategy option along to it. The two commands supply the exact same "ours" and "base" versions. For the "theirs" version (the incoming changes), git cherry-pick C uses C's version as-is, while git apply -3 uses what a straight application of the patch to the "base" version produces. The kicker is that, at factory-default settings, git diff base theirs by definition shows you a patch which, when applied to the "base" version, gives you the "theirs" version. Exactly.

    apply -3 and cherry-pick invoke the exact same 3-way merge function and (when a vanilla C~ C diff is passed to the apply) supply the exact same versions. cherry-pick passes along a lot of convenience options to drive processing that simply isn't relevant here: you're not handling renames and copies, you're not using merge renormalization, there's no possibility of a conflict since you're telling it to pick the "theirs" side as the resolution.

  • I'm not 100% sure if cherry-pick really is, absolutely definitely, going to behave exactly like some form of diff | apply in this case.

    It is: specifically the above diff-tree -p | apply -3. You can see this from the considerations above, and you can verify in the source that the description is accurate. If necessary, fire up gdb and set a breakpoint on ll_merge.
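
The equivalence can also be checked empirically in a throwaway repository. The following is only a minimal sketch, not a proof: the branch names (base, Lp, via-pick, via-apply) and the file contents are made up for illustration, and the two edits are placed far enough apart that the merge is clean. For that reason it leaves out -Xtheirs on the cherry-pick and --theirs on the apply; the latter in particular needs a Git recent enough to accept it.

```shell
#!/bin/sh
# Sketch: compare `git cherry-pick C` with `git diff-tree -p C~ C | git apply -3`
# on a single file, in a scratch repository. All names here are hypothetical.
set -eu

tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email "you@example.com"
git config user.name "Example"

# F: the shared base (stands in for the auto-formatted legacy version).
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\neight\nnine\nten\n' > file
git add file
git commit -q -m "F"
git branch base

# C: a manual change near the top of the file.
git checkout -q -b C base
sed 's/^two$/TWO/' file > file.new && mv file.new file
git commit -q -am "C"

# Lp (stands in for L'): an unrelated change near the bottom.
git checkout -q -b Lp base
sed 's/^nine$/NINE/' file > file.new && mv file.new file
git commit -q -am "Lp"

# Route 1: cherry-pick C onto a copy of Lp and record the resulting blob.
git checkout -q -b via-pick Lp
git cherry-pick C > /dev/null
pick_blob=$(git rev-parse via-pick:file)

# Route 2: feed the C~..C patch to a 3-way apply on another copy of Lp.
git checkout -q -b via-apply Lp
git diff-tree -p C~ C -- file | git apply -3
apply_blob=$(git hash-object file)

# Compare the file contents produced by the two routes.
if [ "$pick_blob" = "$apply_blob" ]; then
    echo "identical"
else
    echo "different"
fi
```

To probe the case from the question, move the two edits next to each other and add -Xtheirs and --theirs respectively, assuming your Git supports the latter. Note that route 2 leaves its result in the index and working tree rather than creating a commit, which is why the sketch compares blob hashes instead of commits.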