Suppose I have the following git history: a master branch starting with commit A, a feature-1 branch branched off of A with commits B and C, and a second feature branch feature-2 that built off of commit C with commits D and E.
master     A
            \
feature-1    B--C
                 \
feature-2         D--E
Now suppose that commit C has been tested and is ready to merge in, so we use git switch master; git merge feature-1 --squash.
master     A------C'
            \    /
feature-1    B--C
                 \
feature-2         D--E
The history for master is nice and clean with just commits A and C', but if we now want to compare master and feature-2 (e.g., git log master..feature-2) we end up seeing all of the commits from feature-1 that were already merged in.
Question 1: Is there an easy way to squash the history for feature-2 to match the squashed merge? What if the history is a little more complicated and there were more commits after the branch point C on feature-1 that were squash-merged into master?
Question 2: Assuming that rewriting history is hard (or can only be tediously done with a git rebase -i; I've got way more than two commits on each branch), is there any way to view only the commits in feature-2 that weren't squash-merged into master? When performing a pull request on GitHub or Bitbucket for feature-2 -> master, is there any way to only list those genuinely new commits?
Now suppose that commit C has been tested and is ready to merge in, so we use git switch master; git merge feature-1 --squash.
master     A------C'
            \    /
feature-1    B--C
                 \
feature-2         D--E
This drawing isn't quite right: it should read the way I've drawn below. Note that I've moved the names to the right as well, for reasons that should become clearer in a moment. I also called the squash commit BC, to make it clear that it is a single commit that does what B and C did together.
What you drew was a real merge (although you called the merge commit C'). As matt said, a "squash merge" isn't a merge at all.
A--BC   <-- master
 \
  B--C   <-- feature-1
      \
       D--E   <-- feature-2
At this point, there's almost no reason to keep the name feature-1. If you delete it, we can redraw the graph like this:
A--BC   <-- master
 \
  B--C--D--E   <-- feature-2
Note that commits A-B-C-D-E are all on branch feature-2 (regardless of whether we delete the name feature-1); commit BC is only on master.
The main reason to retain the name feature-1 is that it identifies commit C, which makes it easy to copy commits D and E (and no others) to new and improved commits D'-E'.
Question 1: Is there an easy way to squash the history for feature-2 to match the squashed merge?
It's not completely clear to me what you mean by "squash the history". Having run the above git merge --squash (and the git commit that has to follow it, since --squash stops before committing), though, the snapshot in commit BC will match (exactly) the snapshot in commit C, so running:
git switch feature-2 && git rebase --onto master feature-1
(note the --onto here[1]) will tell Git to copy commits D and E (only) with the copies going after commit BC, like this:
     D'-E'   <-- feature-2 (HEAD)
    /
A--BC   <-- master
 \
  B--C   <-- feature-1
      \
       D--E   [abandoned]
It's now safe to delete the name feature-1 as we no longer need something to remember the hash ID of commit C. If we stop drawing in the abandoned commits, we end up with:
A--BC   <-- master
    \
     D'-E'   <-- feature-2
which might be what you wanted.
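To try this out safely, here is a minimal sketch that rebuilds the A / B-C / D-E history in a scratch repository and runs the same rebase. The `commit` helper, the file names, and the throwaway identity are illustrative assumptions and not part of the question; it assumes a Git new enough to have git switch.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master" regardless of Git defaults
git config user.name "Example User"       # throwaway identity for this scratch repository
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E

# The squash "merge": stage the combined B+C changes on master, then commit them as BC.
git switch -q master
git merge -q --squash feature-1
git commit -q -m BC

before=$(git log --format=%s master..feature-2)   # still shows B and C

# Copy only D and E, placing the copies after BC.
git switch -q feature-2
git rebase -q --onto master feature-1

after=$(git log --format=%s master..feature-2)    # only the copies of D and E remain
echo "before rebase:" $before
echo "after rebase: " $after
```

The copies keep the original commit messages, so the second listing shows D and E, but their hash IDs differ from the abandoned originals.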
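[1] Normally, git rebase takes one name or commit hash ID. It then:
1. lists out some commits to copy, using that commit hash ID as the limiter;
2. does a git switch --detach on a commit hash ID;
3. copies the commits listed in step 1;
4. moves the current branch name to point to the last copied commit; and
5. does a git switch back to the branch name just moved in step 4.
When not using --onto, the commit hash IDs in steps 1 and 2 are the same. When using --onto, the commit hash IDs in steps 1 and 2 are, or at least can be, different. So with --onto we can tell Git: Only copy some commits, rather than many commits.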
Specifically, without --onto, we'll copy all the commits that are reachable from HEAD, but not reachable from the (single) argument, and the copies will go to the (single) argument. With --onto, we can say: Copy commits reachable from HEAD but not from my specified limiter, to the place specified by my separate --onto argument. In this case that lets us say: do not attempt to copy B and C.
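The effect of git rebase --onto master feature-1 can be approximated by hand with ordinary commands, which may make the steps easier to see. This is only a rough sketch: real git rebase also handles conflicts, records ORIG_HEAD and reflog entries, and skips certain commits. The `commit` helper and file names are illustrative assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m BC
git switch -q feature-2

# List commits reachable from HEAD but not from the limiter, oldest first.
to_copy=$(git rev-list --reverse feature-1..HEAD)

# Detach HEAD at the --onto target.
git switch -q --detach master

# Copy each listed commit in order.
for c in $to_copy; do
    git cherry-pick "$c" > /dev/null
done

# Move the branch name to the last copy, then get back on the branch.
git branch -f feature-2 HEAD
git switch -q feature-2

result=$(git log --format=%s master..feature-2)
echo "commits on feature-2 beyond master:" $result
```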
On the other hand, you can also simply run:
git switch master # if needed - you're probably already there
git merge --squash feature-2
if you just wanted a single squash-merge of the D-E chain:
A--BC--DE   <-- master (HEAD)
 \
  B--C   <-- feature-1
      \
       D--E   <-- feature-2
This git merge --squash will usually go smoothly as well, because, like regular git merge, git merge --squash starts by:
- finding the merge base (commit A in this case);
- diffing the merge base against the current commit (BC, because HEAD is master, which identifies commit BC); and
- diffing the merge base against the other commit (E, because feature-2 names commit E).
The first diff shows what B+C did, because BC's snapshot matches C's, and the second shows what B+C+D+E did, because E's snapshot is the result of B plus C plus D plus E. So unless D and/or E specifically undoes something B and/or C did, the two sets of changes are likely to merge automatically.
(Note that the rebase always goes smoothly here, even if D and/or E undo something.)
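The merge base and the two diffs described above can be inspected directly before running the second squash. The following is a sketch in a scratch repository; the `commit` helper and the one-file-per-commit layout are illustrative assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m BC

# The merge base of master and feature-2 is commit A.
base=$(git merge-base master feature-2)
base_subject=$(git log -1 --format=%s "$base")

# What "we" changed since the base (B+C) and what "they" changed (B+C+D+E).
ours=$(git diff --name-only "$base" master)
theirs=$(git diff --name-only "$base" feature-2)

# Both sides made identical B and C changes, so the squash combines cleanly.
git merge -q --squash feature-2
git commit -q -m DE
added=$(git diff --name-only HEAD~1 HEAD)

echo "base: $base_subject"
echo "ours:" $ours
echo "theirs:" $theirs
echo "DE adds:" $added
```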
The difference between a squash-not-really-a-merge and a real merge is limited to the final commit: the squash has a commit with a single parent, in this case BC, while a real merge has a commit with two parents. In this case a real merge would give you BC as one parent, and E as the other. You probably want to rebase away the B and C commits first if you like having the BC squash-merge.
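The parent-count difference is easy to check. This sketch makes both kinds of commit from the same starting point, on two scratch branches whose names (squash-demo and merge-demo) are made up for the illustration, as are the `commit` helper and file names.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q master
git merge -q --squash feature-1
git commit -q -m BC

# Squash: an ordinary commit whose only parent is BC.
git switch -q -c squash-demo master
git merge -q --squash feature-2
git commit -q -m DE
squash_parents=$(git log -1 --format=%P HEAD | wc -w | tr -d ' ')

# Real merge: a commit with two parents, BC and E.
git switch -q -c merge-demo master
git merge -q --no-ff -m "real merge" feature-2
merge_parents=$(git log -1 --format=%P HEAD | wc -w | tr -d ' ')

echo "squash commit parents: $squash_parents"
echo "merge commit parents:  $merge_parents"
```

Both commits have the same snapshot; only the recorded ancestry differs.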
What if the history is a little more complicated and there were more commits after the branch point C on feature-1 that were squash-merged into master?
As always, the trick is to draw an actual graph. We might start with this:
A   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   <-- feature-2
which, after git switch master && git merge --squash feature-1, produces:
A--BCFG   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   <-- feature-2
It's now appropriate to use:
git switch feature-2 && git rebase --onto master feature-1
Note that this is the same command that we used in the earlier situation. It says (compare with the steps in footnote 1 above):
1. List out commits reachable from feature-2 (where we are after the git switch) but not from feature-1. The commits reachable from feature-2 are A-B-C-D-E, and the commits reachable from feature-1 are A-B-C-F-G. Subtracting A-B-C-F-G from A-B-C-D-E leaves D-E.
2. Get onto a detached HEAD at master, i.e., commit BCFG.
3. Copy the commits listed in step 1, i.e., D and E.
4. Yank the branch name (feature-2) around to where we are now (commit E').
5. Do the equivalent of git switch feature-2 again.
The result is:
       D'-E'   <-- feature-2 (HEAD)
      /
A--BCFG   <-- master
 \
  B--C--F--G   <-- feature-1
      \
       D--E   [abandoned]
after which it's safe to delete the name feature-1: we no longer need an easy way to find commit C via commit G.
Question 2: Assuming that rewriting history is hard (or can only be tediously done with a git rebase -i; I've got way more than two commits on each branch) ...
As you can see above, this isn't necessarily a correct assumption. How hard the rebase is depends on how many merge conflicts you get with each to-be-copied commit, which depends on what happened after the last common commit (C in the drawings above). Still:
... is there any way to view only the commits in feature-2 that weren't squash-merged into master?
The git log command has a simple syntax for this, as long as you still have the name feature-1 identifying the appropriate commit, as in the various drawings above:
git log feature-1..feature-2
does just that. This syntax means all commits reachable by starting at feature-2 and working backwards, minus all commits reachable by starting at feature-1 and working backwards. Note that this is the same set of commits that we copied with our git rebase operations in the examples above.[2]
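To see the two ranges side by side, this sketch builds the more complicated history (with F and G on feature-1) in a scratch repository and compares what master..feature-2 and feature-1..feature-2 list before any rebase. The `commit` helper and file names are illustrative assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c feature-1
commit B
commit C
git switch -q -c feature-2
commit D
commit E
git switch -q feature-1
commit F
commit G
git switch -q master
git merge -q --squash feature-1
git commit -q -m BCFG

# master only reaches A and BCFG, so B and C still show up here.
vs_master=$(git log --format=%s master..feature-2)

# feature-1 reaches A-B-C-F-G, so subtracting it leaves just D and E.
vs_feature1=$(git log --format=%s feature-1..feature-2)

echo "master..feature-2:   " $vs_master
echo "feature-1..feature-2:" $vs_feature1
```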
When performing a pull request on GitHub or Bitbucket for feature-2 -> master, is there any way to only list those genuinely new commits?
No, because these systems do not have the equivalent syntax. However, once you use rebase to copy just the desired commits, and force-push to make the GitHub or Bitbucket repository match, they'll show what you wanted.
[2] Not mentioned above is the fact that git rebase deliberately omits certain commits in step 1 by default. In your case, there are no commits that should be omitted here, so this is not really relevant, but it is worth mentioning:
- git rebase omits all merge commits.
- git rebase also uses the same computations that git cherry or git log --cherry-pick would use to eliminate from the copying any commits whose patch-ID matches a commit in the upstream set of commits. (This set is hard to define without getting into the details of how the A...B symmetric difference notation works.) In your case that doesn't matter either, because this kind of patch-ID matching is extremely unlikely here. It's meant more for the case where someone upstream deliberately used git cherry-pick to copy one or more of your commits to the branch on which you are going to rebase.
- git rebase, when run without an explicit upstream argument, defaults to using git merge-base --fork-point to find commits to omit, and this can produce surprising results.
The rebase documentation has historically been lax about mentioning these, probably because they don't come up all that often. In your case, they should not come up. The latest rebase documentation is greatly improved.
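The patch-ID case can be reproduced in a scratch repository: upstream cherry-picks one commit from a topic branch, git cherry then marks that commit with a leading "-", and a later rebase copies only the other commit. The branch name topic, the commit names X, Y and M, and the `commit` helper are all made up for this sketch and do not come from the question.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: each commit adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A
git switch -q -c topic
commit X
commit Y
git switch -q master
commit M                               # unrelated upstream work
git cherry-pick topic~1 > /dev/null    # upstream copies X; same patch, new hash ID

# "-" marks a commit whose patch already exists upstream, "+" one that does not.
marks=$(git cherry -v master topic)
echo "$marks"

# The rebase leaves out X (already upstream by patch-ID) and copies only Y.
git switch -q topic
git rebase -q master
copied=$(git log --format=%s master..topic)
echo "copied by rebase:" $copied
```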