I'm trying to reconcile a very large feature branch with hundreds of commits that multiple teams and agencies worked on over a long period. Unfortunately, due to corporate policy issues, it wasn't possible to enforce a consistent merging strategy across the teams, so the branch is now inconsistent: it contains mostly squashed merges plus dozens of merge commits. These create a messy commit history and cause conflicts on a simple git rebase -i whenever individual commits within a merge conflict with temporally adjacent commits from other merges that were worked on in parallel.
I could manually squash all the commits and merge commits with something like git rebase -i --rebase-merges, but that introduces a risk of human error: I would need to manually re-order dozens of commits and merge commits, and it would be easy to accidentally squash a commit into the wrong merge commit or in the wrong order.
Git already knows which commits belong to which merge commits, so it should be possible to remove the risk of human error by automating this. I hoped there would be something like git rebase --squash-merges that autosquashes every commit into its merge commit, as if that merge had been a squash-merge, but this feature doesn't appear to exist. (It might be possible to write a chain of commands that works this way, using commands that list the commits of each merge commit?)
Is there some way I can get Git to go through the commits in a branch and automatically squash loose commits into their merge-commits in the right order, without risking human error by doing it manually?
Here's an edited example extract from around line 300 of the output of git rebase -i --rebase-merges some-branch. Here we have two merges with one commit each and two squash-merges. Due to the time gap between the commits and their merges, regular rebases see merge conflicts between "make widget" and "Create McGuffin":
pick af697c57 Release: 1.2.1
label branch-point-9
pick 3e50d3ae XYZ-32: Create McGuffin (#394)
label merge
# Branch org-XYZ-29-make-widget
reset branch-point-9 # Release: 1.2.1
pick 4d53b13c feat: make widget for XYZ-29
merge -C 0d9d6676 merge # merge
label org-XYZ-29-make-widget
# Branch merge-2
reset merge # XYZ-32: Create McGuffin (#394)
merge -C 04d11323 org-XYZ-29-make-widget # Merge pull request #400 from org/XYZ-29/make-widget
pick 060bbde5 XYZ-27: splice mainbrace (#399)
pick 50ae56cf XYZ-36: reticulate splines (#397)
label merge-2
I want to squash this (and every such case) down into something like the following, which should have no conflicts because any conflicts were already resolved in the merges. It should use what Git already knows about which commits belong to which merge commits and how the merges were reconciled:
pick af697c57 Release: 1.2.1
pick XXXXXXXX XYZ-32: Create McGuffin (#394) # commit squashed into its merge commit
pick YYYYYYYY Merge pull request #400 from org/XYZ-29/make-widget # commit squashed into its merge commit
pick 060bbde5 XYZ-27: splice mainbrace (#399)
pick 50ae56cf XYZ-36: reticulate splines (#397)
...which I think could be achieved manually in an interactive rebase by carefully re-ordering commits under their merge commits, squashing into those, and then picking the squashed merge as a regular commit? But doing this at scale adds a risk of human error.
Git stores snapshots and records one or more parents for each commit. Regular commits have a single parent and merge commits have two parents (or more). With that knowledge it should become clear that you want to rewrite the commits to only contain a single parent so that history becomes linear. Each merge commit's snapshot already points to the merge result.
The usual advice applies: do not rewrite published/shared history (and keep backups).
git filter-branch --parent-filter can do exactly that. Its documentation describes the option as follows:
This is the filter for rewriting the commit’s parent list. It will receive the parent string on stdin and shall output the new parent string on stdout. The parent string is in the format described in git-commit-tree: empty for the initial commit, "-p parent" for a normal commit and "-p parent1 -p parent2 -p parent3 …" for a merge commit.
Thus, to keep only -p parent1 and drop all other parents:
git filter-branch --parent-filter 'cut -d" " -f1,2' -- main..yourbranch
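To see what the parent filter does in isolation, here is a small sketch that feeds it the three parent-string shapes described in the documentation quote. The function name first_parent_only and the shortened hashes are made up for illustration; only the cut invocation comes from the command above.

```shell
# Hypothetical wrapper around the same filter that is passed to --parent-filter.
first_parent_only() {
  # Split on spaces and keep fields 1 and 2, i.e. "-p" and the first parent hash.
  cut -d" " -f1,2
}

# Merge commit: every parent after the first is dropped.
printf '%s\n' "-p 1111111 -p 2222222 -p 3333333" | first_parent_only

# Regular commit: the single parent passes through unchanged.
printf '%s\n' "-p 1111111" | first_parent_only

# Initial commit: the empty parent string stays empty.
printf '\n' | first_parent_only
```

Because cut prints lines without a delimiter unchanged, the empty parent string of a root commit is left alone, so the filter only alters merge commits.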
This will turn history such as:
F < yourbranch
|\
D E
| |\
|/ /
B C
|/
A < main
into this (removing all but the first parent from the parent list):
F' < yourbranch
|
D E'
| |
|/
B C
|/
A < main
(The prime, ', marks the "changed commits": those that will end up with a different hash. They are actually new commit objects.)
Note that you will end up with unreachable commits, but you can simply abandon them. The reachable history is only this:
F' < yourbranch
|
D
|
B
|
A < main
In a follow-up step, you might want to reword the rewritten merge commits with an interactive rebase to contain a sensible commit message.