
What is the difference between "git checkout -- ." and "git reset HEAD --hard"?


This is not a general question about what '--' does, as in the marked duplicate. This is a git-specific question asking for clarity on what the operational differences are between the mentioned commands.

If I want to clean out my current directory without stashing or committing, I usually use these commands:

git reset HEAD --hard
git clean -fd

A co-worker also mentioned using this command:

git checkout -- .

It's a difficult command to google, and it's not clear to me from the git documentation what this command actually does. It seems to be one of the later-mentioned usages in the manual.

At a guess it replicates git reset HEAD --hard, but what exactly does it do as compared to the commands I'm already using?
Does it replicate one or both of the commands, or is it similar yet subtly different?


Solution

  • First, let's just address the double hyphen or double dash, to get it out of the way (especially since this question no longer has a marked duplicate).

    Git mostly uses this in the POSIX-approved fashion (see Guideline 10), to indicate a dividing line between option arguments and non-option arguments. Since git checkout accepts branch names, as in git checkout master, and also file (path) names, as in git checkout README.txt, you can use the -- to force Git to interpret whatever comes after the -- as a file name, even if it would otherwise be a valid branch name. That is, if you have both a branch and a file named master:

    git checkout master
    

    will check out the branch, but:

    git checkout -- master
    

    will check out the file (confusingly, from the current index).
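
    The disambiguation described above can be tried in a throwaway repository. This is only a sketch: the name topic (used for both a branch and a file), the temporary directory, and the demo identity are assumptions made for illustration, not part of the question.

    ```shell
    set -e
    repo=$(mktemp -d)                 # scratch repository, safe to delete afterwards
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity for the commit
    git config user.name "Demo"

    echo "committed" > topic
    git add topic
    git commit -q -m "add a file named topic"
    git branch topic                  # "topic" is now both a branch and a file

    echo "scribble" > topic           # unstaged edit to the file
    git checkout -q -- topic          # "--" forces the path interpretation
    cat topic                         # prints "committed": the index copy is back

    git checkout -q topic             # without "--", the branch interpretation wins
    git symbolic-ref --short HEAD     # prints "topic"
    ```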

    Branches, index, and files, oh my

    Next, we need to address a quirk of git checkout. As one can see from the documentation, there are many "modes" of git checkout (the documentation lists six separate invocations in the synopsis!). There are various rants (of varying quality: Steve Bennett's is actually useful, in my opinion, though naturally I do not agree with it 100% :-) ) about Git's poor "user experience" model, including the fact that git checkout has too many modes of operation.

    In particular, you can git checkout a branch (to switch branches), or git checkout one or more files. The latter extracts the files from a particular commit, or from the index. When Git extracts files from a commit, it first copies them to the index, and then copies them from the index to the work-tree.

    There is an underlying implementation reason for this sequence, but the fact that it shows through at all is a key element. We need to know a lot about Git's index, because both git checkout and git reset use it, and sometimes in different ways.

    It's a good idea, I think, to draw a three-way diagram or table illustrating the current (or HEAD) commit, the index, and the work-tree. Suppose that each of these three entities holds a different set of files right now: rmd.txt has been removed from the index and the work-tree but is still in HEAD, new.txt has been added to the index but never committed, and untr.txt exists only in the work-tree. The table of the entire state then looks like this:

      HEAD       index    work-tree
    -------------------------------
    README.md  README.md  README.md
    file.txt   file.txt   file.txt
               new.txt    new.txt
    rmd.txt
                          untr.txt
    

    There are many more possible states than just these: in fact, for each file-name, there are seven possible combinations of "in/not-in" HEAD, index, and work-tree (the eighth combination is "not in all three", in which case, what file are we even talking about in the first place?!).
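
    For anyone who wants to reproduce the state in the table, here is one way it could be built. This is a sketch under assumptions: the file contents, the temporary directory, and the demo identity are made up, and only the file names come from the table.

    ```shell
    set -e
    repo=$(mktemp -d)                 # scratch repository
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"

    echo "readme" > README.md
    echo "file" > file.txt
    echo "to be removed" > rmd.txt
    git add README.md file.txt rmd.txt
    git commit -q -m "initial commit"  # HEAD: README.md, file.txt, rmd.txt

    git rm -q rmd.txt                 # gone from index and work-tree, still in HEAD
    echo "new" > new.txt
    git add new.txt                   # in index and work-tree, not in HEAD
    echo "untracked" > untr.txt       # work-tree only

    echo "HEAD:";      git ls-tree -r --name-only HEAD
    echo "index:";     git ls-files
    echo "work-tree:"; ls
    ```

    The three listings correspond to the three columns of the table; git status --short shows the same differences in compact form.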

    The checkout and reset commands

    The two commands you're asking about, git checkout and git reset, are both able to do many things. The specific invocations you mention, however, reduce the "things done" to one of two, git checkout -- . and git reset --hard HEAD, to which I will add two more for comparison: git checkout HEAD -- . and git reset without --hard.

    These overlap a lot, but there are several crucially different parts.

    Let's consider the file named new.txt above in particular. It's in the index right now, so if we copy from the index, to the work-tree, we replace the work-tree copy with the index copy. This is what git checkout -- new.txt does, for instance.

    If, instead, we start by copying from HEAD to the index, nothing happens to new.txt in the index: new.txt doesn't exist in HEAD. Hence an explicit git checkout HEAD -- new.txt just fails, while a git checkout HEAD -- . copies the files that are in HEAD and leaves the two existing new.txt versions undisturbed.

    The file rmd.txt is gone from the index, so if we git checkout -- ., Git does not see it and does nothing about it. But if we git checkout HEAD -- ., Git copies rmd.txt from HEAD into the index (now it's back) and then from the index to the work-tree (and now it's back there, too).
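
    The behavior of the two checkout forms on new.txt and rmd.txt can be checked with a small script. It is a sketch that assumes the table's state, rebuilt here in a scratch repository with made-up file contents and a placeholder identity, and it assumes Git's default checkout behavior (no extra options).

    ```shell
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"
    echo "readme" > README.md
    echo "file" > file.txt
    echo "to be removed" > rmd.txt
    git add README.md file.txt rmd.txt
    git commit -q -m "initial commit"
    git rm -q rmd.txt                 # in HEAD only
    echo "new" > new.txt
    git add new.txt                   # in index and work-tree only

    git checkout -q -- .              # index -> work-tree
    [ -e rmd.txt ] || echo "rmd.txt is still absent after: git checkout -- ."

    # new.txt is not in HEAD, so naming it explicitly is an error
    if git checkout HEAD -- new.txt 2>/dev/null; then
        echo "unexpected success"
    else
        echo "git checkout HEAD -- new.txt failed, as described"
    fi

    git checkout -q HEAD -- .         # HEAD -> index -> work-tree
    git ls-files                      # rmd.txt is back, new.txt is still listed
    ```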

    The git reset command has a key difference when used with no path name arguments. Here, it literally re-sets the index to match the commit. That means that for new.txt, it notices that the file is not in HEAD, so it removes the index entry. If used with --hard, it therefore also removes the work-tree entry. Meanwhile rmd.txt is in HEAD, so it copies that back to the index, and with --hard, to the work-tree as well.
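
    As a rough check of the reset behavior, the script below rebuilds the table's state and then runs the two commands from the question. The file contents, scratch directory, and identity are assumptions for illustration only.

    ```shell
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"
    echo "readme" > README.md
    echo "file" > file.txt
    echo "to be removed" > rmd.txt
    git add README.md file.txt rmd.txt
    git commit -q -m "initial commit"
    git rm -q rmd.txt                 # in HEAD only
    echo "new" > new.txt
    git add new.txt                   # staged, never committed
    echo "untracked" > untr.txt       # never added

    git reset -q --hard HEAD          # index and work-tree are re-set to HEAD
    status_after_reset=$(git status --short)
    echo "$status_after_reset"        # only the untracked file is reported

    git clean -q -fd                  # removes what reset leaves behind: untracked files
    ls                                # README.md, file.txt and rmd.txt remain
    ```

    Note that new.txt disappears from the work-tree as well as the index, while untr.txt survives the reset and is only removed by git clean.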

    If there are unstaged, i.e., work-tree only, changes to the other two files README.md and file.txt, both forms of git checkout and the --hard form of git reset wipe out those changes.

    If there are staged changes to those files—changes that have been copied into the index—then git reset un-stages them. So does the variant of git checkout where you give it the name HEAD. However, the variant of git checkout where you copy the index files back to the work-tree keeps those staged changes staged!
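
    The difference for staged changes can be seen with a single file. In this sketch the three versions of file.txt, the scratch directory, and the identity are hypothetical; the point is only which version each command leaves behind.

    ```shell
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"

    echo "version one" > file.txt
    git add file.txt
    git commit -q -m "initial commit"          # HEAD holds "version one"
    echo "version two, staged" > file.txt
    git add file.txt                           # index holds "version two, staged"
    echo "version three, not staged" > file.txt

    git checkout -q -- .                       # index -> work-tree
    after_index=$(cat file.txt)                # the staged version
    staged_after_index=$(git diff --cached --name-only)   # file.txt is still staged
    echo "$after_index / staged: $staged_after_index"

    git checkout -q HEAD -- .                  # HEAD -> index -> work-tree
    cat file.txt                               # the committed version
    git diff --cached --name-only              # prints nothing: no longer staged
    ```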

    Top level vs current directory

    Last, it's worth noting that ., meaning the current directory, may at any time be different from "top of Git repository":

    $ git rev-parse --show-toplevel
    /home/torek/src/kernel.org/git
    $ pwd
    /home/torek/src/kernel.org/git/Documentation
    $ git rev-parse --show-cdup
    ../
    

    Here, I am in the Documentation sub-directory of the top level directory git, so . means everything in Documentation and its subdirectories. Using git checkout -- . will check out (from the index) all the Documentation and Documentation/RelNotes files, but not any of the ../builtin files, for instance. But git reset, when used without path names, will reset all entries, including those for .. and ../builtin.
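
    The same effect can be reproduced on a small scale. The layout below (top.txt at the top level and sub/inner.txt in a subdirectory) is a made-up stand-in for the Git source tree shown above, and the contents and identity are likewise assumptions.

    ```shell
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"   # placeholder identity
    git config user.name "Demo"

    mkdir sub
    echo "top original" > top.txt
    echo "inner original" > sub/inner.txt
    git add top.txt sub/inner.txt
    git commit -q -m "initial commit"

    echo "top edited" > top.txt                # unstaged edits in both places
    echo "inner edited" > sub/inner.txt

    cd sub
    git rev-parse --show-cdup                  # prints "../"
    git checkout -q -- .                       # "." is sub/, not the top level
    top_after_checkout=$(cat ../top.txt)
    echo "$top_after_checkout"                 # the edit outside sub/ survived
    cat inner.txt                              # restored from the index

    git reset -q --hard HEAD                   # no path names: covers the whole repository
    cat ../top.txt                             # now restored as well
    ```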