git merge: how did I get a conflict in BASE file?


I have the following in the BASE file of a git merge:

<<<<<<<<< Temporary merge branch 1
    - _modifiedTimeWeak = 4.25.2019::11:41:6;
    - _lastID = 3;
    - weakCGTime = 4.25.2019::11:42:20;
    - strongCGTime = 1.2.1990::0:0:0;
=========
    - _modifiedTimeWeak = 5.1.2019::8:52:36;
    - _lastID = 3;
    - weakCGTime = 5.1.2019::8:52:36;
    - strongCGTime = 3.20.2019::17:13:20;
>>>>>>>>> Temporary merge branch 2

I've since performed a baseless merge of the file, so there are no outstanding issues, but I would like to understand what could have gone wrong.

I have checked the BASE commit identified by git merge-base, and it did not contain the conflict markers shown above, so that is ruled out. This is also the first time I have encountered this, despite many earlier merges in this repository.

It may be worth noting that I was using git mergetool.

What could cause the BASE file in a merge to contain conflict markers, and what steps can be taken to prevent this in the future?


Solution

  • Jargon answer

    These occur when there are multiple merge bases and merging the merge bases itself produces a merge conflict. The merge base candidates are the lowest common ancestors (LCAs) of the commits you choose to merge, in the subgraph induced by their common ancestors:

    DEFINITION 3.1. Let G = (V, E) be a DAG, and let x, y ∈ V. Let G_{x,y} be the subgraph of G induced by the set of all common ancestors of x and y. Define SLCA(x, y) to be the set of out-degree 0 nodes (leafs) in G_{x,y}. The lowest common ancestors of x and y are the elements of SLCA(x, y).

    (see https://www3.cs.stonybrook.edu/~bender/pub/JALG05-daglca.pdf). Here G is the directed acyclic graph formed by the commits in your repository, and x and y are the two commits you're choosing to merge. I actually prefer definition 3.2 a bit, though it uses posets, and might be even more jargon-y: it feels more relatable (and in fact is used for genealogy, which is what Git is doing).
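
    As a rough illustration of Definition 3.1, here is a minimal Python sketch. It is not how Git implements merge-base; the commit names and the helper functions ancestors and lowest_common_ancestors are hypothetical. The graph maps each commit to its parents, as Git does, so a "leaf" of the common-ancestor subgraph is a common ancestor that no other common ancestor descends from.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # commit name -> list of parent commit names


def ancestors(parents: Graph, commit: str) -> Set[str]:
    """Return the commit itself plus everything reachable via parent links."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def lowest_common_ancestors(parents: Graph, x: str, y: str) -> Set[str]:
    """Common ancestors of x and y that are not ancestors of another one."""
    common = ancestors(parents, x) & ancestors(parents, y)
    dominated: Set[str] = set()
    for c in common:
        # Every proper ancestor of a common ancestor is "higher", so drop it.
        dominated |= ancestors(parents, c) - {c}
    return common - dominated


# Hypothetical history: A2 and B2 are merges that each have A1 and B1 as parents.
HISTORY: Graph = {
    "R": [],
    "A1": ["R"], "B1": ["R"],
    "A2": ["A1", "B1"], "B2": ["B1", "A1"],
    "A3": ["A2"], "B3": ["B2"],
}

print(sorted(lowest_common_ancestors(HISTORY, "A1", "B1")))  # ['R']
print(sorted(lowest_common_ancestors(HISTORY, "A3", "B3")))  # ['A1', 'B1']
```

    The second call returns two commits: that is the multiple-merge-base situation described in this answer.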

    The recursive merge strategy, -s recursive, uses all the merge bases: it merges them with each other, commits the result (complete with any merge conflicts) as a temporary commit, and uses that temporary commit as the new merge base. So that's the answer to the first part of the question ("What could cause a base file ... to have merge conflicts").

    To avoid this, you have several options, but let's describe the problem more understandably first.

    Long but more useful answer

    You mention that you used git merge-base. By default, git merge-base picks one of the best-merge-base commit candidates and prints that one's hash ID. If you run git merge-base --all on the two branch tip commits, you'll see that there are multiple best-merge-base commit candidates.

    In a typical, easy, branch-and-merge pattern we have:

                 o--o--o   <-- branch-A
                /
    ...--o--o--*
                \
                 o--o--o   <-- branch-B
    

    The common merge base—as found by either 3.1 or 3.2 in the cited paper; with 3.2, you can just walk back from the two branch tips until you find a commit that's on both branches—is of course commit * and Git merely needs to diff * against the two tip commits of branch-A and branch-B respectively.

    Not all graphs are so neat and simple. The easiest way to get two merge bases is to have a criss-cross merge in the history chain, as illustrated here:

    ...--o--o--*---o--o--o   <-- branch-C
                \ /
                 X
                / \
    ...--o--o--*---o--o--o   <-- branch-D
    

    Note that both starred commits are on both branches, and both commits are equally close to the two branch tips. Running git merge-base --all branch-C branch-D will print the two hash IDs. Which commit should Git use as the merge base?

    Git's default answer is: Let's use them all! Git will run, in effect:

    git merge <hash-of-first-base> <hash-of-second-base>
    

    as a recursive (inner) merge. This merge can have conflicts!

    If the merge does have conflicts, Git doesn't stop and get help from you, the user. It just commits the conflicted result, conflict markers and all. This commit becomes the merge base for the outer git merge, the one you're directly asking for:

    ...--o--o--*---o--o--o   <-- branch-C
                \ /
                 M   <-- temporarily-committed merge result
                / \
    ...--o--o--*---o--o--o   <-- branch-D
    

    The temporary commit isn't actually in the graph, but for the purpose of your outer merge, it might as well be.
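
    To see how conflict markers can end up in the BASE version, here is a toy Python model of the inner and outer merges. It is only a sketch under strong assumptions: a "file" is a dictionary of settings, the merge works key by key rather than on text hunks, and the values t0 through t4 and the function three_way_merge are made up for illustration. Git's real merge is considerably more involved.

```python
from typing import Dict

Snapshot = Dict[str, str]  # toy stand-in for a file: setting name -> value


def three_way_merge(base: Snapshot, ours: Snapshot, theirs: Snapshot,
                    ours_label: str = "ours",
                    theirs_label: str = "theirs") -> Snapshot:
    """Merge key by key; assumes all three snapshots share the same keys."""
    merged: Snapshot = {}
    for key in base:
        b, o, t = base[key], ours[key], theirs[key]
        if o == t:        # both sides agree
            merged[key] = o
        elif o == b:      # only "theirs" changed the value
            merged[key] = t
        elif t == b:      # only "ours" changed the value
            merged[key] = o
        else:             # both changed it differently: record a conflict
            merged[key] = (f"<<<<<<< {ours_label}\n{o}\n=======\n"
                           f"{t}\n>>>>>>> {theirs_label}")
    return merged


# Hypothetical contents: the two merge bases and their own common ancestor.
root = {"_lastID": "3", "weakCGTime": "t0"}
base_1 = {"_lastID": "3", "weakCGTime": "t1"}
base_2 = {"_lastID": "3", "weakCGTime": "t2"}

# Inner merge: nobody is asked to resolve the conflict, it is simply kept.
virtual_base = three_way_merge(root, base_1, base_2,
                               "Temporary merge branch 1",
                               "Temporary merge branch 2")

# Outer merge: the two branch tips are compared against the virtual base.
tip_c = {"_lastID": "3", "weakCGTime": "t3"}
tip_d = {"_lastID": "4", "weakCGTime": "t4"}
result = three_way_merge(virtual_base, tip_c, tip_d, "branch-C", "branch-D")

print(virtual_base["weakCGTime"])  # the BASE value already holds markers
```

    In this model the BASE value for weakCGTime is itself a conflict block labelled with the two temporary merge branches, which is the shape of what the question shows.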

    How to avoid the problem

    Now that we see how the problem arises, the ways to avoid it are clearer—not exactly clear, but clearer than they were, at least:

    In any case there's no single right answer, which is the same as with any merge conflict. The issue is that you're now resolving a conflict that probably should have been resolved at some point in the past, when these two conflicting merges were made.