Tags: git, git-merge, git-commit, git-merge-conflict, git-plumbing

Git: merge multiple commits into one on an orphan branch, each commit in a prefix subdirectory


I need to merge more than one commit, each from a branch or a remote repo, into a single commit on another branch.

input branch#1: o--o- - -o   (C1)
                          \
input branch#2: o--o- - -o | (C2)
    :                     \|
input branch#N: o--o- - -o | (Cn)
                          \|
 output branch:   o--o- - -o (Cm)

I need to do it in a special way: the source tree of each input branch commit becomes a prefix (subdirectory) in the source tree of the output branch merge commit:

<C1>       <C2>       ...  <Cn>
 |          |               |
 +- c1.txt  +- c2.txt       +- cn.txt

<Cm>
 |
 +- C1/c1.txt
 |
 +- C2/c2.txt
 |
 :     :
 |
 +- Cn/cn.txt

Additionally, I need to change some parameters of the merge commit, such as the author date and author email, and generate its commit message from the commit messages of all input branches, while leaving the parents of the merge commit as they are (including the list of parent commit hashes in the merge commit).

Searching the internet, I found what looks like the most universal solution with a minimal set of commands:

git merge --allow-unrelated-histories --no-edit --no-commit -s ours <input-branches-and-commits>
git read-tree --prefix=C1/ <C1-branch>
git read-tree --prefix=C2/ <C2-branch>
:
git read-tree --prefix=Cn/ <Cn-branch>
cat ... | git commit --no-edit --allow-empty --author="..." --date="..." -F -

But it works differently when the output branch is an orphan branch. In that case the content of an input branch is additionally merged into the root of the source tree of the output branch commit:

<Cm>
 |
 +- C1/c1.txt
 |
 +- c1.txt

So far I have seen this happen when there is only one input branch. I have not tested multiple input branches with an orphan output branch because I have not had that case yet, so I cannot rule it out there either.

I have found the reason why that happens. Because HEAD does not exist yet (the output branch has no commit to point at), the merge command creates it on the call and at the same time leaves the merge incomplete, with the output branch pointing at the input branch's commit, which effectively makes the output branch its own parent. This silently brings the content of the input branch's source tree into the root of the source tree of the output branch commit.
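The behavior described above can be reproduced in a throwaway repository with a sketch like the following. The branch names `in1` and `out`, the identity values, and the temporary directory are assumptions made only for illustration; the merge options are the ones from the command sequence above.

```shell
set -eu

# Throwaway repository; nothing here touches an existing project.
work=$(mktemp -d)
cd "$work"
git init -q .
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME='Example' GIT_COMMITTER_EMAIL='example@example.com'

# Input branch in1 with a single file c1.txt.
git symbolic-ref HEAD refs/heads/in1
echo 'one' > c1.txt
git add c1.txt
git commit -q -m 'input 1'

# Orphan output branch with an empty index and working tree.
git checkout -q --orphan out
git rm -rf -q .

# Same merge invocation as in the question. With no commit on "out" yet,
# Git moves the branch to in1's commit instead of preparing a merge.
git merge --allow-unrelated-histories --no-edit --no-commit -s ours in1

# The index already holds c1.txt at the root, so adding the prefixed copy
# leaves both C1/c1.txt and c1.txt staged.
git read-tree --prefix=C1/ in1
git ls-files
```

After the merge call, `git rev-parse out` and `git rev-parse in1` name the same commit, which is why the root-level `c1.txt` shows up in the next commit.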

I know at least one way to avoid that behavior: create an empty commit on the output branch before the merge, which makes the branch no longer an orphan and initializes HEAD together with the reference to the output branch.

But I don't want to do that, because I would have to remove that commit somehow later, which is really just workaround code for Git.

Is there a well-known way to work with the Git internals so that everything merges together as expected?


Solution

  • If you're going to use git read-tree to fill the index for the commit you're building—and yes, this is the easy way to add a prefix to each, just as you are doing—you are already deep in the innards of Git, so you might as well use git commit-tree to build the commit object.

    In other words, don't start with git merge at all. Just empty out the index with git read-tree --empty. Then read each commit Ci, 1 ≤ i ≤ n. Your index now contains the files you intend to put into this merge commit Cm.

    Then, instead of git commit, use git write-tree to turn the index into a tree object, followed by git commit-tree to embed the tree object in a new commit. Since git commit-tree allows you to specify each parent, you can make your N-way octopus merge directly:

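    # Start from an empty index, then read each input tree under its prefix.
    # The trailing slash makes each prefix a directory name.
    git read-tree --empty
    git read-tree --prefix=prefix1/ C1
    git read-tree --prefix=prefix2/ C2
    ...
    git read-tree --prefix=prefixn/ Cn
    
    # commit-tree needs the tree ID; the message is read from stdin.
    tree=$(git write-tree) || die ...
    commit=$(cat ... | git commit-tree "$tree" -p C1 -p C2 -p C3 ... -p Cn) || die ...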
    

    Last, attach a new branch name to the resulting commit:

    git branch the-final-result "$commit"
    

    and you have your commit Cm on this new branch.
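As a rough end-to-end sketch of the steps above, the following builds a two-parent merge commit in a throwaway repository. The branch names `in1` and `in2`, the prefixes `C1/` and `C2/`, the identity values, the timestamp, and the way the message is assembled are all assumptions for illustration. It relies on `git commit-tree` taking the author and committer from the `GIT_AUTHOR_*` and `GIT_COMMITTER_*` environment variables and reading the message from standard input.

```shell
set -eu

# Throwaway repository; nothing here touches an existing project.
work=$(mktemp -d)
cd "$work"
git init -q .
export GIT_AUTHOR_NAME='Example Author' GIT_AUTHOR_EMAIL='author@example.com'
export GIT_COMMITTER_NAME='Example Committer' GIT_COMMITTER_EMAIL='committer@example.com'

# Two unrelated input branches, each a root commit with one file.
for i in 1 2; do
    git symbolic-ref HEAD "refs/heads/in$i"   # unborn branch
    git read-tree --empty
    echo "content $i" > "c$i.txt"
    git add "c$i.txt"
    git commit -q -m "input $i"
    rm "c$i.txt"
done

# Fill the index with each input tree under its own prefix.
git read-tree --empty
git read-tree --prefix=C1/ in1
git read-tree --prefix=C2/ in2
tree=$(git write-tree)

# Assemble a message from the tip commit subjects of the inputs.
msg=$(for b in in1 in2; do git log -1 --format='%s' "$b"; done)

# Author date in Git's internal format: "<unix timestamp> <utc offset>".
commit=$(printf 'Merge inputs under prefixes\n\n%s\n' "$msg" |
    GIT_AUTHOR_DATE='1577934245 +0000' \
    git commit-tree "$tree" -p in1 -p in2)

git branch the-final-result "$commit"
git ls-tree -r --name-only the-final-result
```

None of these commands touch the working tree, so after the branch is created it can be checked out in the usual way.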

    Edit: apparently I misread the question a bit, and you already have one existing branch name B whose tip commit is currently commit CB. If you want to preserve its files, read this tree initially instead of using git read-tree --empty, then use that commit as one of the parents in the final git commit-tree, and simply fast-forward the existing branch name B to the new commit. So:

    # CB is the current tip commit of the existing branch B.
    git read-tree CB
    git read-tree --prefix=prefix1/ C1
      .
      .
      .
    git read-tree --prefix=prefixn/ Cn
    
    tree=$(git write-tree) || die ...
    commit=$(cat ... | git commit-tree "$tree" -p CB -p C1 -p C2 ... -p Cn) || die ...
    git push . "$commit":refs/heads/B  # or git branch -f B "$commit"
    

    Adjust per actual desired result.
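For the existing-branch variant, a minimal sketch might look like this. It swaps in `git update-ref` with an expected old value in place of the `git push .` shown above, so the update is refused if B moved in the meantime; that substitution, the branch names `B` and `in1`, the prefix `C1/`, and the identity values are assumptions for illustration.

```shell
set -eu

# Throwaway repository; nothing here touches an existing project.
work=$(mktemp -d)
cd "$work"
git init -q .
export GIT_AUTHOR_NAME='Example' GIT_AUTHOR_EMAIL='example@example.com'
export GIT_COMMITTER_NAME='Example' GIT_COMMITTER_EMAIL='example@example.com'

# Existing branch B with one commit.
git symbolic-ref HEAD refs/heads/B
echo 'base' > base.txt
git add base.txt
git commit -q -m 'base on B'
old=$(git rev-parse B)
rm base.txt

# Unrelated input branch in1.
git symbolic-ref HEAD refs/heads/in1
git read-tree --empty
echo 'one' > c1.txt
git add c1.txt
git commit -q -m 'input 1'
rm c1.txt

# Start from B's tree, then add the input tree under its prefix.
git read-tree B
git read-tree --prefix=C1/ in1
tree=$(git write-tree)

# B's old tip is the first parent, so the update is a fast-forward.
commit=$(echo 'Merge in1 under C1/' | git commit-tree "$tree" -p B -p in1)

# Move B only if it still points at the commit recorded in $old.
git update-ref -m 'merge via commit-tree' refs/heads/B "$commit" "$old"
git ls-tree -r --name-only B
```

Here HEAD is on `in1` while `B` is updated, so no checked-out branch moves underneath the working tree; if `B` were checked out, the working tree would need to be refreshed afterwards.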