Tags: git, branch, branching-strategy

Looking to incorporate changes from sub-branch into branch but both contain same commits


Definitely a neophyte when it comes to git, hence what probably is a pretty bone-headed question to follow here:

I have a master branch that I started working on an enhancement in. Twenty-something or so commits in I realized I should have made a separate branch and done my work on it.

So I created a new main/dev branch off of master and then I reset the master branch pointer to point to the initial repo commit which is the point before I started doing my work:

$ git reset --hard <commit hash>

Immediately after I did this I got the idea of the following workflow (not sure from where--lol):

My idea was to do all the main development along my own "main" branch and not on master. I would then create new feature/enhancement branches off from this main, I guess dev branch? I would then eventually reincorporate the new enhancement work back into the "main" branch, and then, when all of my own "main" work is done, merge that into master. (Not sure if this is a good workflow... doesn't seem so.)

So immediately after this great idea I thought "hmmm...let me just create another separate specific 'enhancement' branch off of main."

I think I may have created the "sub" enhancement branch incorrectly. While on "main" I just used

$ git checkout -b enhancement

not

$ git checkout -b enhancement main

Now when I'm on branch "main" and run:

$ git merge enhancement

I get the following:

Already up to date.

When I run

$ git show-branch 

which to my understanding, is used to show all branches and all their commits, I get the following output:

! [enhancement] Test enhancing, fixing, and cleaning.
 * [main] Test enhancing, fixing, and cleaning.
  ! [master] Grammar improvements
---
+*  [enhancement] Test enhancing, fixing, and cleaning.
+*  [enhancement^] Bug Fix: addition of path id to updateTodo method.
+*  [enhancement~2] Additon of tests to TodoTest.java and TodoResourceTest.java for ValidForUpdate Todo annotation.
+*  [enhancement~3] Refactoring: moved all custom validation messages to properties file and all associated changes. Also cleaned up imports here and there.
+*  [enhancement~4] Refactoring: moved PastOrPresentValidator to an inner class of the PresentOrPast annotation class.
+*  [enhancement~5] Addition of ValidForUpdate custom constraint Todo object annotation and freinds.
+*  [enhancement~6] Fixed problem of same four tests failing when run with all tests in class but passing individually. Changed init from 'BeforeClass' to 'BeforeEach' fixed it.
+*  [enhancement~7] Added properties to enable forked & threaded test processing in surefire plugin.
+*  [enhancement~8] Added propertey for maven surefire plugin version.
+*  [enhancement~9] Refactoring of tests. Move of mian Todo template test object and friends to central location of TestTodoCreater.
+*  [enhancement~10] Refactoring to get PresetntOrPast validation error message from properties file. Also some slight refactoring by replacing validation error message string literals in tests with string constant.
+*  [enhancement~11] Addition of new 'PresentOrPast' validator and accompanying test changes. Also a little bit of test clean-up/refactoring.
+*  [enhancement~12] Addition of 'updateTodo' api method and accompanying tests. Tightening up of validation annotations in TodoResource. Addition of Todo field validations and all accompanying tests. Major refactoring of TodoResourceTest.
+*  [enhancement~13] Backed out previous commit's changes and updated dropwizard version to 1.3.13. Changes backed out due to what appears to be lack of Hibernate Validator 6 support in the dropwizard-testing library.
+*  [enhancement~14] Moved ValidationMessages.properties file. Still not yet using.
+*  [enhancement~15] updated hibernate-validator and java validation-api to latest versions.
+*  [enhancement~16] Changes required to add 'update' method to TodoDAO and some code cleanup.
+*  [enhancement~17] Changes in order to get maven to run ALL (Junit5) tests with 'mvn test'. Specific changes to get TodoIntegrationSmokeTest to run by adding test-config.yml.
+*  [enhancement~18] Completion of TodoIntegratoinSmokeTest for pre-existing api methods.
+*  [enhancement~19] Initial commit of TodoIntegratoinTest with working testCreateTodo method.
+*  [enhancement~20] Some minor code cleanup up and refactoring.
+*  [enhancement~21] completed tests for pre-existing api methods.
+*  [enhancement~22] Cleaned up TodoResource test and TestUtils.
+*  [enhancement~23] All API methods test-covered 'except' for deleteTodo.
+*  [enhancement~24] Validation for TododResource#getTodos and accompanying test changes .
+*  [enhancement~25] Began shoring-up code with parameter validation and accompanying test changes.
+*  [enhancement~26] Completed bulk of TodoDaoTest but missing update implementation and some odds and ends.
+*  [enhancement~27] A little more clean up/refactoring of TodoDAO associated test classes.
+*  [enhancement~28] Clean up of TodoDAOTest2
+*  [enhancement~29] Slight modification to server rootPath value.
+*  [enhancement~30] Addition of beginning of Todo DAO test classes. Creates a PostgresSQL Docker container and laods the schema with flyway.
+*  [enhancement~31] Replaced TodoResource string literal test path values with static UriBuilders.
+*  [enhancement~32] Changes in this commit are in support of getting the TodoResourceTest completed by covering all original api methods(given with task).
+*  [enhancement~33] Small change. Renamed Todo JSON test class to 'TodoRepresentationTest' to use Dropwizard-specific terminology. Removed non-used import.
+*  [enhancement~34] Conversion of Todo class into an immutable 'value' class. Added tests for serializing and deserializing Todo objects to and from JSON.
+*+ [master] Grammar improvements

Unless I'm misinterpreting the output of this command, it looks like "main" contains all the same commits as "enhancement" which makes it look like any commits I've been making to the enhancement branch have also somehow been being committed to the main branch.

It looks like I've been working in and committing the same stuff to two branches at the same time.

Also after running

$ git show-ref --head

I get:

b43e3c2b3d19a4a19497cf78e3909727f25796a2 HEAD
b43e3c2b3d19a4a19497cf78e3909727f25796a2 refs/heads/enhancement
b43e3c2b3d19a4a19497cf78e3909727f25796a2 refs/heads/main
...

Which indicates that HEAD is pointing to both branches at the same time? How did this happen???

Also when I run the command to show the different commits between the two branches:

$ git log --left-right --graph --cherry-pick --oneline main...enhancement

There is no output. So this is saying there is no difference between the two branches at all. So I'm really scratching my head as to what I did here.

I love git but she confounds me again :(.

I've read up on "branches" in git, and from what I can tell a branch is just a pointer to a specific commit, and whatever branch HEAD points to is the one you're currently working in. So committing will just add successive commits on top of HEAD while advancing HEAD to the most recent commit in that branch.

So the HEAD of my enhancement branch should be way ahead of the HEAD of my main branch containing commits main does not.

The thing I don't understand is how come when I switch to either branch the HEAD is pointing to the same commit if I've been working in the enhancement branch for the last 17 commits.

So what I'm expecting to see is different commit histories with enhancement containing more commits, but I don't, they have the same commits!

Please, someone have mercy and shed some light by pointing out to me what I'm doing wrong here. I've been banging my head on this for a looong time.

Any and all help immeasurably appreciated!


Solution

  • The root of your issue is that branches are not hierarchical.

    Commits have a graph structure. In particular, almost every commit has a parent, with some having two parents. Some oddball commits might have three or more, and at least one—the very first commit ever made in the repository—necessarily has no parents.

    It's this parent/child relationship that forms the graph structure. We might start with a tiny repository, with just three commits:

    A <-B <-C
    

    The actual names of the three commits are three big ugly hash IDs, but we can use single uppercase letters to stand in for those hash IDs, as long as we don't make more than a few dozen commits (how many we'll get depends on how many letters are in your alphabet: do you have both O and Ö, for instance?). The last commit, C, contains the actual hash ID of commit B, so we say that C points to B. B is C's parent. Commit B contains A's hash ID as B's parent, so B points to A. A was the first commit so it cannot point to anything earlier, and does not; here, the action stops.

    To add a new commit to this repository, we would save a new source code snapshot, add our name and email address and the date-and-time-stamp, save a log message, and save C's hash ID, all into a new commit, which is automatically assigned its own new, unique hash ID that we'll call D, and now we have:

    A <-B <-C <-D
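
    The parent links can be inspected directly. Here is a minimal sketch in a throwaway repository, assuming git is installed; the temporary directory, the placeholder identity, and the commit messages A through D are made up for the illustration, and --allow-empty is used only so that no files are needed:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master        # make sure the first branch is called master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"

for msg in A B C D; do
    git commit -q --allow-empty -m "$msg"      # each new commit stores the previous tip as its parent
done

d=$(git rev-parse HEAD)                        # hash ID of D
c=$(git rev-parse HEAD^)                       # D's parent, i.e. C
git cat-file -p "$d" | grep '^parent'          # the parent line inside D holds C's hash ID
roots=$(git rev-list --max-parents=0 HEAD)     # commits with no parent at all: only A
git log --format='%h %s  parent: %p' HEAD      # walk the chain backwards from D
```

    Your hash IDs will differ from run to run, since they depend on the time stamps, but the shape of the chain will not.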
    

    But where do branch names come into this picture? Well, remember, each commit has some big ugly random-looking hash ID. Here are four actual hash IDs, in some order:

    7c20df84bd21ec0215358381844274fa10515017
    14fe4af084071803ab4f16e6841ff64ba7351071
    c62bc49139f1d18e922fc98e35bb08b1aadbcafc
    9b274e28871b3e4a4109582a34625df5fddc91c8
    

    Which of these should I call commit A, which one should I call B, and so on? If I want to start at the end (at the latest commit, as Git does), do I start with 14fe..., or with 9b27..., or what?

    We could look inside each of these four commits to see what parent hash ID they store. For instance:

    $ git cat-file -p 9b274e28871b3e4a4109582a34625df5fddc91c8 | sed 's/@/ /'
    tree c921299d1381a3bd6486ef999e3cc432118d1d72
    parent e46249f73ebddca06cf16c01e8de1f310360c856
    parent f3eda90ffc10f9152e7492a34408a9f5e4c28b0f
    author Junio C Hamano <gitster pobox.com> 1564776722 -0700
    committer Junio C Hamano <gitster pobox.com> 1564776722 -0700
    
    Merge branch 'jc/log-mailmap-flip-defaults'
    
    Hotfix for making "git log" use the mailmap by default.
    
    * jc/log-mailmap-flip-defaults:
      log: really flip the --mailmap default
      log: flip the --mailmap default unconditionally
    

    tells me that commit 9b274e28871b3e4a4109582a34625df5fddc91c8 has two parents, e46249f73ebddca06cf16c01e8de1f310360c856 and f3eda90ffc10f9152e7492a34408a9f5e4c28b0f, neither of which is one of the four I listed. If I look at every commit in the repository, and gather up all of their parent lines, I can—eventually, after a lot of work—figure out which commits are at the end. But that's very slow.

    Git's answer to this is branch names. A branch name simply holds the hash ID of one (1) commit:

    $ git rev-parse master
    7c20df84bd21ec0215358381844274fa10515017
    

    Aha, that's one of the four commits I listed above! That commit is the last commit that Git should consider to be part of master. If we look inside it:

    $ git cat-file -p 7c20df84bd21ec0215358381844274fa10515017 | sed 's/@/ /'
    tree 8858576e734aa4f1cd9b45e207e7ee2937488d13
    parent 14fe4af084071803ab4f16e6841ff64ba7351071
    author Junio C Hamano <gitster pobox.com> 1564776744 -0700
    committer Junio C Hamano <gitster pobox.com> 1564776744 -0700
    
    Git 2.23-rc1
    
    Signed-off-by: Junio C Hamano <gitster pobox.com>
    

    we see that this commit has exactly one parent, 14fe4af084071803ab4f16e6841ff64ba7351071, which is another of my four hash IDs. So master is the end of the chain, and this other big ugly hash ID that starts with 14fe... is the next one back:

    ...--G--H   <-- master
    

    H is the 7c20... commit, and G is the 14fe... commit. Let's see G's parents, this time using git rev-parse and its special "print all the parent hash IDs" syntax:

    $ git rev-parse 14fe4af084071803ab4f16e6841ff64ba7351071^@
    c62bc49139f1d18e922fc98e35bb08b1aadbcafc
    d61e6ce1dda7f4b11601a0de549feefbcec55779
    

    The c62b... one is the third of my list of four; there's another one that's not in my list. This commit is a merge commit, as we can see if we look at the rest of it:

    $ git cat-file -p 14fe4af084071803ab4f16e6841ff64ba7351071 | sed 's/@/ /'
    tree 06a0b1de4cb3857cdd23a939a857dc720240496b
    parent c62bc49139f1d18e922fc98e35bb08b1aadbcafc
    parent d61e6ce1dda7f4b11601a0de549feefbcec55779
    author Junio C Hamano <gitster pobox.com> 1564776723 -0700
    committer Junio C Hamano <gitster pobox.com> 1564776723 -0700
    
    Merge branch 'sg/fsck-config-in-doc'
    
    Doc update.
    
    * sg/fsck-config-in-doc:
      Documentation/git-fsck.txt: include fsck.* config variables
    

    We can call c62bc49139f1d18e922fc98e35bb08b1aadbcafc commit F. We might want some other letter than E for d61e6ce1dda7f4b11601a0de549feefbcec55779; let's use I:

    ...--F--G--H
           /
     ...--I
    

    Now let's put in the branch name master. The name itself identifies commit H, and only directly means commit H, so:

    ...--F--G--H   <-- master
           /
     ...--I
    

    We can see from G's log message that Junio Hamano made G by running:

    git merge sg/fsck-config-in-doc
    

    so let's draw in that name too, pointing to commit I:

    ...--F--G--H   <-- master
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    The interesting thing about this is that commits F, G, and H are all on master ... but what it means for a commit to be "on a branch" is that we can start at the end—at commit H, in this case—and work backwards and reach that commit. So from H, we walk back the one and only step that we can, and see G. From G, we can step back to both F and I. So not only is commit F on master, so is commit I.

    Meanwhile, sg/fsck-config-in-doc is a branch name. It identifies commit I. So commit I is not only on master, it's also on sg/fsck-config-in-doc.

    This is the first thing about branch names. They're just labels. They just act as ways to let Git get started in terms of looking at the graph. A branch name identifies one particular commit, which we call the tip commit of that branch. That commit identifies some parent commit or commits, and those commits are also on the branch. By moving to, or looking at, a parent, we find some more parents; those are also on the branch.
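
    To see "on a branch" as plain reachability, here is a small sketch in a scratch repository. The branch name side is a hypothetical stand-in for sg/fsck-config-in-doc, and the single-letter commit messages only mimic the drawing above; git merge-base --is-ancestor answers "can I walk backwards from this tip and reach that commit?" through its exit status:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"

git commit -q --allow-empty -m F
git checkout -q -b side                        # hypothetical side branch
git commit -q --allow-empty -m I
git checkout -q master
git merge -q --no-ff -m G side                 # G gets two parents: F and I
git commit -q --allow-empty -m H

i=$(git rev-parse side)                        # tip of side
h=$(git rev-parse master)                      # tip of master
git merge-base --is-ancestor "$i" master && echo "I is on master"
git merge-base --is-ancestor "$i" side   && echo "I is on side"
git merge-base --is-ancestor "$h" side   || echo "H is not on side"
```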

    To make a new commit on some branch, we start by having Git select that particular branch name as the current branch. To create a new branch name, we have Git select some commit—the default being the current commit—and make a new name, pointing to that one specific commit. So if we have:

    ...--F--G--H   <-- master
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    and ask Git to create a new name topic, we get:

    ...--F--G--H   <-- master, topic
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    Commits through H, plus those through I, are now on both master and topic. If we select topic as the current branch, Git extracts commit H if needed and attaches the name HEAD to the name topic:

    ...--F--G--H   <-- master, topic (HEAD)
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    and when we make a new commit, its parent will be H—the previous tip of topic—and the new commit will become the one that the name topic identifies:

                 J   <-- topic (HEAD)
                /
    ...--F--G--H   <-- master
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    Note that HEAD is still attached to the branch name, but the branch name has moved.
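
    The same sequence can be tried in a scratch repository. This is only a sketch with empty placeholder commits named H and J; it shows that a freshly created name holds the same hash ID as the name it was created from, and that a new commit moves only the name HEAD is attached to:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
git commit -q --allow-empty -m H

git branch topic                               # new name, pointing to the current commit
before_master=$(git rev-parse master)
before_topic=$(git rev-parse topic)            # same hash ID as master at this point

git checkout -q topic                          # attach HEAD to topic
git commit -q --allow-empty -m J               # only the attached name moves

git show-ref --head                            # HEAD and topic share one hash; master keeps H's
master_tip=$(git log -1 --format=%s master)
topic_tip=$(git log -1 --format=%s topic)
attached=$(git symbolic-ref --short HEAD)      # the branch name HEAD is attached to
```

    In the question, main and enhancement are in the "before" state: two names, one hash ID.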

    Moving a branch name does not affect any of the commits in the repository! They all exist before you move the name, and they continue to exist after you move the name. No matter where you move the name, the graph itself remains unchanged (though in order to make the arrow coming from the name point to the right commit, you might sometimes want to draw the graph a little differently).

    Let's say we decide to make topic point to I instead of J:

                 J
                /
    ...--F--G--H   <-- master
           /
     ...--I   <-- sg/fsck-config-in-doc, topic (HEAD)
    

    Commit J still exists, but now it will be very hard to find. If you start at master and work backwards, you get H, then G, then F and I, and so on. You can't reach J: the arrows point the wrong way. If you start at the tip of topic, which is now I, and work backwards, you can't reach J. Commit J is, in effect, abandoned.
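
    A rough sketch of this abandonment, again with placeholder commits in a scratch repository. Here topic is moved back to master's commit rather than to a separate commit I, which is a simplification for the example. The final command relies on the reflog, which by default keeps a record of where HEAD has been in an ordinary (non-bare) repository:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
git commit -q --allow-empty -m H
git checkout -q -b topic
git commit -q --allow-empty -m J
j=$(git rev-parse HEAD)                        # remember J's hash ID while we can still find it

git reset -q --hard master                     # move the name topic back to H
kind=$(git cat-file -t "$j")                   # J is still stored in the repository
holders=$(git for-each-ref --contains "$j" refs/heads)   # but no branch name reaches it
echo "J is a $kind; branches containing it: [$holders]"
git reflog -n 2                                # the reflog still lists the commit HEAD left behind
```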

    Specific commands do specific things with these names

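    As you've already seen, git checkout -b:

      • creates a new branch name that points to the chosen commit (the current commit by default); and
      • attaches HEAD to that new name.

    Meanwhile, git reset --hard:

      • moves the branch name to which HEAD is attached, so that it points to the commit you specify; and
      • resets the index and work-tree to match that commit, discarding any uncommitted changes.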

    The git merge command gets a bit complicated—not that git reset is any better, with its many different operational modes—but what it does begins with looking at the commit graph. You pick some commit—maybe by branch name, with the branch name identifying the tip commit—and git merge examines the commit graph, to determine how to work backwards from both your current commit and the commit you've named, so as to reach a common, shared commit.

    In this case, you have set things up like this:

    ...--F--G--H   <-- master
                \
                 I--J--...--N   <-- main (HEAD), enhancement
    

    and then run git merge enhancement. The name enhancement identifies commit N. So does the current branch name main. The shared commit that's on both branches is therefore commit N. Thus there is nothing to do.

    You can, right now, move the name main to point to commit H, using:

    git reset --hard master
    

    As before, the --hard will reset the index and work-tree, so be sure you have nothing you didn't already save; the end result is draw-able as:

    ...--F--G--H   <-- master, main (HEAD)
                \
                 I--J--...--N   <-- enhancement
    

    If you now run git merge enhancement, Git will walk backwards from H, and also walk backwards from N, to find the first shared commit. That's commit H itself, which is on all three branches. Git will then, if you allow it, decree that this kind of merge is too trivial to be worth any real work, and simply move the name main so that it identifies commit N, checking out commit N at the same time.

    Git calls this a fast-forward operation, and at the end of it, the picture is:

    ...--F--G--H   <-- master
                \
                 I--J--...--N   <-- main (HEAD), enhancement
    

    which is where you were just a moment ago!
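
    The whole round trip can be replayed in a scratch repository. The sketch below uses three empty placeholder commits (H, I, N) in place of your real history; the branch names match the question. The exact wording Git prints for each merge can vary with version and locale, so the comments describe the effect rather than quote the output:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
git commit -q --allow-empty -m H               # stands in for the tip of master
git checkout -q -b main
git commit -q --allow-empty -m I
git commit -q --allow-empty -m N
git branch enhancement                         # second name for the same commit N

n=$(git rev-parse enhancement)
git merge enhancement                          # nothing to do: both names already identify N
after_first=$(git rev-parse main)

git reset -q --hard master                     # move main back to H
after_reset=$(git log -1 --format=%s main)
git merge enhancement                          # fast-forward: main slides forward to N
merges=$(git rev-list --count --merges main)   # number of merge commits reachable from main
```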

    You can, however, force Git to make a true merge. Before we go there, let's look at a case where Git itself would have to do a true merge.

    True merge

    Suppose that instead of the above, you started with:

    ...--F--G--H   <-- master (HEAD)
    

    You then run:

    git checkout -b branch
    

    which adds a new branch name branch, identifying the current commit H, and attaches HEAD to it:

    ...--F--G--H   <-- master, branch (HEAD)
    

    Now you make a new commit I. I's parent will be H as usual. The current name, branch, is rewritten to point to new commit I, as the last step of this git commit operation:

    ...--F--G--H   <-- master
                \
                 I   <-- branch (HEAD)
    

    For fun (or really, so that the merge commit we make later lands on the letter M), let's make a second commit that's only on branch, vs all those shared commits ending at H:

    ...--F--G--H   <-- master
                \
                 I--J   <-- branch (HEAD)
    

    Now let's git checkout master, which checks out commit H and re-attaches HEAD:

                 I--J   <-- branch
                /
    ...--F--G--H   <-- master (HEAD)
    

    and then make two new commits on master. For some inscrutable reason I'll draw them on a new top row (this is not actually necessary):

                 K--L   <-- master (HEAD)
                /
    ...--F--G--H
                \
                 I--J   <-- branch
    

    Now we run git merge branch (or git merge hash-of-J, which will do the same thing except for the commit log message). Git must start at L and work backwards, and start at J and work backwards, to find the best shared commit, namely H.

    This shared commit is called the merge base. Now that Git knows which commit is the merge base, it checks to see if the merge base is the current commit L and/or the other commit J, because those are the special easy cases. It's neither, so this is not a trivial merge.

    The goal of a merge is to combine changes, but commits H, L, and J all just have snapshots. So Git first has to convert L and J into changes. By starting from the best common starting point, the merge base, both sets of changes should apply to that common starting point. To get the changes, Git in effect runs two git diff --find-renames commands: one from H to L, to see what we changed on master, and one from H to J, to see what they changed on branch.

    The merge process now tries to combine the changes, applying the combined changes to the snapshot from the merge base H. If all goes well—if the changes combine nicely—Git makes a new snapshot from the result. This new snapshot has two parents. The first one is the usual one. We're on master, which means commit L, so the first parent of the new merge is L. The other commit we named is J, so the second parent of the new merge is J. Having made the commit, Git then writes its new hash ID into the current branch name, as always.

    The final result, then—if all goes well—is:

                 K--L
                /    \
    ...--F--G--H      M   <-- master (HEAD)
                \    /
                 I--J   <-- branch
    

    and we have produced a true merge.
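
    Here is a compact sketch of such a true merge, assuming a scratch repository with two made-up files, a.txt and b.txt. To keep it short, each side gets one commit instead of two (L on master, J on branch), and the two sides touch different files so that the changes combine without conflict:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
echo base > a.txt
echo base > b.txt
git add a.txt b.txt
git commit -q -m H

git checkout -q -b branch
echo "changed on branch" > b.txt
git commit -q -am J
git checkout -q master
echo "changed on master" > a.txt
git commit -q -am L

base=$(git merge-base master branch)           # the merge base: commit H
git diff --find-renames --stat "$base" master  # what we changed on master: a.txt
git diff --find-renames --stat "$base" branch  # what they changed on branch: b.txt
git merge -q -m M branch                       # neither tip is the merge base, so a real merge
git rev-parse 'HEAD^@'                         # M's parents: L first, then J
```

    After the merge, the work-tree holds both changes, and M's first parent is the commit that was current when git merge ran.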

    Forcing a true merge with --no-ff

    Let's go back to this setup:

    ...--F--G--H   <-- master, main (HEAD)
                \
                 I--J--...--N   <-- enhancement
    

    and run git merge --no-ff enhancement. The --no-ff tells Git: Even if the merge base is the current commit H, so that a fast-forward would be possible, do a true merge anyway.

    Git will now proceed to do the two diffs (internally). The first one compares H vs H. This, of course, produces an empty change-set. The second one compares H vs N. This, of course, produces all the changes required to turn the snapshot that's in H into the one that's in N.

    Git then combines the first change-set—the one that says "do nothing"—with the second. The result is just the second change-set. Applying this to the snapshot in H, Git gets the snapshot in N. The combining works fine, though, so now Git makes a new merge commit which we'll call O. O's first parent is H, and its second parent is N, and we have:

                 ,------------------O   <-- main (HEAD)
                /                  /
    ...--F--G--H   <-- master     /
                \                /
                 I--J----...----N   <-- enhancement
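
    A sketch of the forced merge in a scratch repository follows. The file work.txt and the single commit N are placeholders for your real enhancement work, and the merge message O is given with -m so that no editor opens:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
git commit -q --allow-empty -m H
git branch main                                # main and master both identify H
git checkout -q -b enhancement
echo work > work.txt                           # made-up file standing in for the real work
git add work.txt
git commit -q -m N
git checkout -q main

git merge -q --no-ff -m O enhancement          # a fast-forward is possible, but we ask for a merge commit
git log --graph --oneline --all                # draws the picture above
first=$(git log -1 --format=%s 'HEAD^1')       # first parent of O
second=$(git log -1 --format=%s 'HEAD^2')      # second parent of O
```

    The snapshot in O is the same as the snapshot in N; what O adds is the record that the side work was brought in as one unit.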
    

    Which should you use?

    Sometimes, fast-forwards are the way to go. Sometimes, real merges are the way to go. Let's look again at the graph of the actual Git repository for Git that I used earlier:

    ...--F--G--H   <-- master
           /
     ...--I   <-- sg/fsck-config-in-doc
    

    We can now take away the name sg/fsck-config-in-doc. All it is doing is giving us direct access to commit I, but we can get there from H. So let's remove the name:

    ...--F--G--H   <-- master
           /
     ...--I
    

    The idea here is that branch names don't matter (except in terms of finding the end commits). Only commits matter. A fast-forward operation typically makes two names identify the same commit. If you're planning to take away one or both names in the future, you'll never know that the fast-forward ever happened.
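
    As a small illustration of that last point, the sketch below fast-forwards master onto a placeholder commit N and then deletes the other name. It assumes a scratch repository with empty commits; afterwards nothing in the graph records that enhancement ever existed:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"            # placeholder identity for this sketch
git config user.email "user@example.invalid"
git commit -q --allow-empty -m H
git checkout -q -b enhancement
git commit -q --allow-empty -m N
n=$(git rev-parse HEAD)

git checkout -q master
git merge -q enhancement                       # fast-forward: master now identifies N too
git branch -q -d enhancement                   # remove the name; commit N itself is untouched

git merge-base --is-ancestor "$n" master && echo "N is still on master"
merges=$(git rev-list --count --merges master) # merge commits reachable from master
names=$(git for-each-ref --format='%(refname:short)' refs/heads)   # remaining branch names
```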

    Is this good or bad? Well, suppose you really want to know in the future that O, the merge that you forced with --no-ff, was a merge action. This implies that commits through N were some side work, and the side work was finally ready for prime time and added all at once. So then you do want a merge commit, even if git merge would default to doing a fast-forward.

    On the other hand, maybe you-in-the-future won't care that commits up through N were some sort of side-work. Side-work vs main-work is all irrelevant: it was just you, doing work. Maybe some of the specific commits before N don't matter either. Maybe you'd like to just have the commits all in a row, for easy viewing, or maybe you'd like to take this:

    ...--F--G--H   <-- master, main (HEAD)
                \
                 I--J--...--N   <-- enhancement
    

    and turn it into:

    ...--F--G--H--O--P   <-- main
    

    and just discard all of the I-through-N commits, keeping two new commits O and P that compactly represent "do first thing properly" and then "do second thing properly", rather than the rambling stuff you did in I, J, K, etc. In that case, you don't want a merge at all.

    What you-in-the-future will have is a commit graph with some designated tip commits. That's all you will have. You-today get to determine the set of commits in the commit graph; you-tomorrow can decide which names to keep; and future-you will bless or curse today-you and tomorrow-you based on how good a job you did in this planning ... or maybe, not care at all. It's up to you to decide how careful you want to be in planning your commits.

    But in any case, the names only matter for finding the commits. It's the commits and their resulting graph that really matter.