Very scary situation right now: I used the GitLens extension of VSCode to jump back to an older commit. I wanted to check out the commit, so I located it in the COMMITS sidebar, right-clicked, and selected Switch to Commit... I expected to check out that commit and then be able to switch back to my current state.
Now running git log
shows me the log of my commits only up to the point of the commit that I have selected. This is scary. Where are my newer commits?
As it is now I cannot locate my newer commits and go back to them. I have made a new commit just before switching to the older commit, so I am 100% certain there should be newer commits.
This is a new project that I have not committed to any remote location yet, so git pull
cannot bail me out.
I really hope someone can help me, I do not want to lose 2 days of work...
This is scary, to those new to Git. But don't worry: all the commits are still there.
Various GUIs, including Visual Studio, hide Git's workings from you (which could be good or bad, depending on your point of view). I don't use these GUIs, precisely because they keep you from seeing what's going on, so I can't say exactly what each clicky button in your GUI does. Git itself, however, works like this:
There is, at all times,[1] a current commit. Git has a special name for this commit: HEAD
, written in all uppercase just like this.[2]
At most times, there is also a current branch. Git has a special name by which you can access this current branch: HEAD
.
You might—in fact, you should—object at this point: how do we know whether HEAD
refers to the commit or to the branch name? Git's answer is: I pick one or the other based on whichever one I want at the moment. Some things need a branch name, in which case, HEAD
turns into the branch name. Some things need a commit, in which case HEAD
turns into the commit. Basically there are two internal ways Git has to ask what's the HEAD now. One gives a branch-name answer, like master
or main
or whatever, and the other gives you a raw commit hash ID.
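To see the two answers side by side, here is a minimal sketch. It assumes a throwaway repository made with mktemp, a made-up "Demo" identity, and a branch deliberately named main; none of that comes from your project. git symbolic-ref asks the branch-name question and git rev-parse asks the commit question:

```shell
# Throwaway repository; the identity and the branch name "main" are demo choices.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch "main" whatever the local default is
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "first commit"

# Question 1: which branch name is HEAD attached to?
head_branch=$(git symbolic-ref --short HEAD)
# Question 2: which commit does HEAD select right now?
head_commit=$(git rev-parse HEAD)

echo "branch name: $head_branch"
echo "commit hash: $head_commit"
```

The first line printed names the branch; the second is the raw hash ID, which will be different every time you run this.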
OK, so, with this in mind, we now remember that git log
prints out the log like this:
commit eb27b338a3e71c7c4079fbac8aeae3f8fbb5c687 (...)
Author: ...
...
commit fe3fec53a63a1c186452f61b0e55ac2837bf18a1
...
That is, we see all these weird hash IDs spill out, one at a time. The hash IDs are the actual, true-names of each commit. Each commit gets a globally-unique hash ID: no two different commits are ever allowed to have the same one. That's why the hash IDs are so big and ugly. They look random. They aren't actually random, but they are unpredictable.[3]
A branch name like main
translates to a commit hash ID. A raw hash ID already is a hash ID. Either way, given the right hash ID, Git can find the commit.
Each commit holds a full snapshot of every file,[4] plus some metadata: information about the commit itself, such as who made it, and when, and a log message they can write at the time. Crucially for Git itself, one item in this metadata is the raw hash ID of the previous commit.
There's one other random fact about commits that is useful to remember here: Once made, no part of any commit can ever be changed. That's how the hash IDs actually work, and it's critical to Git being a distributed version control system. But it also means that no Git commit can ever contain the raw hash ID of its future child commits, because we have no idea what those will be when we create the commit. Commits can store the "names" (hash IDs) of their parents, because we do know their ancestry when we create the children.
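You can look at the stored parent hash ID yourself. This is a small sketch under the same kind of assumptions as any demo: a temporary repository, a made-up identity, and two empty commits whose messages (G and H) are only labels. git cat-file -p prints a commit object's raw contents:

```shell
# Throwaway repository with two commits labelled G and H (demo names only).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in G H; do git commit -q --allow-empty -m "$msg"; done

# The raw commit object: tree, parent, author and committer lines, then the message.
git cat-file -p HEAD

# The "parent" line inside H is the hash ID of G.
stored_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
actual_parent=$(git rev-parse HEAD^)
echo "stored in H: $stored_parent"
echo "hash of G:   $actual_parent"
```

Both echoed hash IDs should be identical: H carries G's hash ID inside itself, but G has no idea H exists.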
What this means for us here is that the commits remember their parents, which forms a sort of backwards-looking chain. All we have to do is remember the raw hash ID of the latest commit. When we do that, we end up with a chain that we can draw like this:
... <-F <-G <-H <--main
Here, the name main
holds the real hash ID of the latest commit, which for drawing purposes, we just call H
. Commit H
in turn holds the hash ID of earlier commit G
, which holds the hash ID of still-earlier commit F
, and so on.
We can now see how git log
works: it starts with the current commit, H
, as selected by the current branch, main
. To make main
be the current branch, we attach the special name HEAD
to the name main
:
...--F--G--H <-- main (HEAD)
Git uses HEAD
to find main
, uses main
to find H
, and shows us H
. Then Git uses H
to find G
and shows us G
; then it uses G
to find F
, and so on.
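That backwards walk can be imitated by hand. The sketch below is not how git log is implemented; it just follows first parents with git rev-parse in a temporary repository whose three empty commits are labelled F, G and H (all demo assumptions):

```shell
# Throwaway repository with commits F, G, H on main (labels are demo choices).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in F G H; do git commit -q --allow-empty -m "$msg"; done

# Start where HEAD points, then keep stepping to the parent until there is none.
commit=$(git rev-parse HEAD)
walk=""
while [ -n "$commit" ]; do
    walk="$walk$(git log -1 --format=%s "$commit") "
    # "<hash>^" means "the parent of <hash>"; this fails quietly at the very first commit.
    commit=$(git rev-parse -q --verify "$commit^") || commit=""
done
echo "manual walk: $walk"

# git log visits the same commits in the same order.
echo "git log:     $(git log --format=%s | tr '\n' ' ')"
```

Both lines list H, then G, then F.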
When we want to look at any historical commit, we pick it out, by hash ID, and tell Git: attach HEAD
directly to that commit. We can draw that like this:
...--F <-- HEAD
      \
       G--H <-- main
When we run git log
now, Git translates HEAD
to a hash ID—which it finds directly this time; there's no attached branch name—and shows us commit F
. Then git log
moves on from there, backwards. Where are commits G
and H
? They are nowhere to be seen!
But it's OK: if we run git log main
, git log
starts with the name main
, rather than with the name HEAD
. That finds commit H
, which git log
shows; then git log
moves to G
, and so on. Or, we can even run:
git log --branches
or:
git log --all
to have git log
find all branches or all refs ("refs" include branches and tags, but also other kinds of names).
(This brings up another, separate can-of-worms, which is all about how git log
handles the case of "wanting" to show more than one commit "at the same time". I won't go there at all, in this answer.)
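Here is a sketch that reproduces the scare and the reassurance in a temporary repository. The commit labels F, G, H, the "Demo" identity, and the use of git checkout --detach to stand in for whatever your GUI button does are all assumptions made for the demo:

```shell
# Throwaway repository with commits F, G, H on main (labels are demo choices).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in F G H; do git commit -q --allow-empty -m "$msg"; done

# Attach HEAD directly to commit F, two steps back from main.
git checkout -q --detach main~2

# With no branch name attached, asking for the branch-name answer fails.
if git symbolic-ref -q HEAD >/dev/null; then head_state=attached; else head_state=detached; fi

detached_log=$(git log --format=%s | tr '\n' ' ')       # starts at HEAD: only F
main_log=$(git log --format=%s main | tr '\n' ' ')      # starts at main: H, G, F
# Sorted only so the printed order is predictable in this demo.
all_log=$(git log --format=%s --all | sort | tr '\n' ' ')

echo "HEAD is $head_state"
echo "git log:       $detached_log"
echo "git log main:  $main_log"
echo "git log --all: $all_log"
```

Plain git log shows only F, while the other two forms still find G and H through the name main.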
This "viewing a historical commit" mode, in Git, is called detached HEAD mode. That's because the special name HEAD
is no longer attached to a branch name. To re-attach your HEAD
, you simply choose a branch name, with git checkout
or (Git 2.23 or later) git switch
:
git switch main
for instance. You've now checked out the commit that the branch name main
selects, and HEAD
is now re-attached to the name main
.
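To watch files disappear and come back, here is a small sketch. The file notes.txt, the commit labels, and the temporary repository are invented for the demo. It uses git checkout so that it also runs on Git older than 2.23:

```shell
# Throwaway repository: commit G is empty, commit H adds notes.txt (demo names only).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m G
echo "two days of work" > notes.txt
git add notes.txt
git commit -q -m H

# Detach HEAD at the older commit G: notes.txt leaves the working tree.
git checkout -q --detach main~1
if [ -e notes.txt ]; then while_detached=present; else while_detached=absent; fi

# Re-attach HEAD to main; with Git 2.23 or later, `git switch main` does the same.
git checkout -q main
if [ -e notes.txt ]; then after_reattach=present; else after_reattach=absent; fi
attached_to=$(git symbolic-ref --short HEAD)

echo "while detached: notes.txt is $while_detached"
echo "after re-attaching to $attached_to: notes.txt is $after_reattach"
```

The file was never lost: it was stored in commit H the whole time, and checking out main brings it back.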
Before we stop, there's one more really important thing to learn, which is: how branches grow. But let me get footnotes out of the way first.
[1] There's an exception to this rule, necessary in a new, totally empty repository that has no commits at all. That exception can be used in a weird way later, in a non-empty repository. You won't be making use of this though.
[2] The lowercase variant, head
, often "works" on Windows and macOS (but not on Linux and others). However, this is deceptive, because if you start using the git worktree
feature, head
(lowercase) doesn't work correctly—it gets you the wrong commit sometimes!—while HEAD
(uppercase) does. If you don't like typing in all-caps, consider using the shorthand @
character, which you can use instead of HEAD
.
[3] Git uses cryptographic hashing here: the same kind of stuff one finds in cryptocurrencies, though not as strict (Git currently still uses SHA-1, which is already outdated in cryptographic terms).
[4] The snapshots are stored in a special, read-only, Git-only, compressed and de-duplicated format. Git shows commits as "changes since previous commit" but stores commits as snapshots.
Suppose we have the following situation:
...--G--H <-- main (HEAD)
We now want to make a new commit, but we'd like to put it on a new branch. So we first ask Git to make a new branch name, and point that name to commit H
too:
git branch develop
which results in:
...--G--H <-- develop, main (HEAD)
Now we pick develop
as the name to have HEAD
attached-to, with git checkout
or git switch
:
...--G--H <-- develop (HEAD), main
Note that we're still using commit H
. We're just using it through the other name now. The commits up through and including H
are on both branches.
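As a quick check that creating and switching to a branch name does not touch any commit, here is a sketch in a temporary repository. The labels G and H and the "Demo" identity are assumptions for the demo:

```shell
# Throwaway repository with commits G and H on main (labels are demo choices).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in G H; do git commit -q --allow-empty -m "$msg"; done

git branch develop          # new name, pointing at the same commit H
git checkout -q develop     # attach HEAD to develop (or: git switch develop)

main_commit=$(git rev-parse main)
develop_commit=$(git rev-parse develop)
current_branch=$(git symbolic-ref --short HEAD)

echo "main    -> $main_commit"
echo "develop -> $develop_commit"
echo "HEAD is attached to $current_branch"
```

The two hash IDs printed are the same: there are two names now, but still only one commit H.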
We now make a new commit, the usual way we do in Git. Once we're ready, we run git commit
and give Git a log message to put in the metadata for the new commit. Git now:
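- saves a full snapshot of every file;
- writes the metadata for the new commit, which we'll call I, so that I will point backwards to existing commit H;
- uses your user.name and user.email settings as the author and committer of this new commit, using "now" as the date-and-time;
- computes the hash ID of the new commit. (This hash ID comes in part from the hash ID of H, and in part from the snapshot we've saved: everything that is in the new commit goes into making up the new random-looking hash ID, which is why we can't predict it.)

So now we have this new commit I, pointing back to existing commit H: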
...--G--H
         \
          I
Now Git does the other bit of magic that makes it all work: git commit
writes I
's hash ID into the current branch name. That is, Git uses HEAD
to find the name of the current branch, and updates the hash ID stored in that branch name. So our picture is now:
...--G--H <-- main
         \
          I <-- develop (HEAD)
The name HEAD
is still attached to the branch name develop
, but the branch name develop
now selects commit I
, not commit H
.
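The same thing can be checked from the command line. In this sketch the temporary repository, the identity, and the labels G, H and I are demo assumptions; the point is that git commit moves only the name HEAD is attached to:

```shell
# Throwaway repository with commits G and H on main (labels are demo choices).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in G H; do git commit -q --allow-empty -m "$msg"; done

git checkout -q -b develop            # create develop at H and attach HEAD to it
commit_h=$(git rev-parse main)

git commit -q --allow-empty -m I      # new commit I; only develop moves

main_after=$(git rev-parse main)
develop_after=$(git rev-parse develop)
parent_of_i=$(git rev-parse develop^)

echo "main    still selects $(git log -1 --format=%s main)"
echo "develop now selects   $(git log -1 --format=%s develop)"
```

After the commit, main still holds H's hash ID, develop holds I's, and I's parent is H.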
It's commit I
that leads back to commit H
. The name just lets us find the commit. The commits are what really matter: branch names are just there to let us find the last commit. Whatever hash ID is in that branch name, Git says that that commit is the last commit on that branch. So since main
says H
right now, H
is the last commit on main
; since develop
says I
right now, I
is the last commit on develop
. Commits up through H
are still on both branches, but I
is only on develop
.
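If you want Git to tell you which branches a commit is on, git branch --contains does that. The sketch below builds the G, H, I picture in a temporary repository (labels and identity are demo assumptions) and relies on the --format option of git branch, which needs Git 2.13 or later:

```shell
# Throwaway repository: G and H on main, then I on develop (labels are demo choices).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in G H; do git commit -q --allow-empty -m "$msg"; done
git checkout -q -b develop
git commit -q --allow-empty -m I

# List the branch names from which each commit can be reached.
branches_with_h=$(git branch --format='%(refname:short)' --contains main | tr '\n' ' ')
branches_with_i=$(git branch --format='%(refname:short)' --contains develop | tr '\n' ' ')

echo "H is on: $branches_with_h"
echo "I is on: $branches_with_i"
```

H is reported on both develop and main, while I is reported on develop only.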
Later, if we like, we can have Git move the name main
. Once we move main
to I
:
...--G--H--I <-- develop, main
then all commits are once again on both branches. (I left out HEAD
this time because we might not care which branch we are "on", if both select I
. In fact, we can delete either name—but not both—because both names select the same commit and that's all we need to find the right hash ID. If we were to write this hash ID down somewhere, we might not need any name. But that would be ... yucky, at best. We have a computer; let's have it save the big ugly hash IDs for us, in nice neat names.)