
What's the difference between 'git rm --cached', 'git restore --staged', and 'git reset'?


I have come across the following three ways to unstage files that were staged with 'git add':

git rm --cached <file>
git restore --staged <file>
git reset <file>

They appeared to behave identically when I ran them one by one. What exactly are the differences between them?


Solution

  • Two are the same (git restore --staged and git reset); one (git rm --cached) is not, except under particular circumstances.

    To understand this, remember that the index / staging-area contains, at all times, your proposed next commit. It was initially seeded from your current commit when you ran git checkout or git switch to obtain that commit.[1] Your working tree thus contains a third copy[2] of each file; the first two copies are the one in the current commit (aka HEAD) and the one in the index.
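
    The three copies can be inspected side by side. The following is a minimal sketch, assuming a throwaway repository in a temporary directory; the file name demo.txt and its contents are made up for illustration:

```shell
# Sketch: one file, three copies (HEAD, index, working tree).
# The repository, file name, and contents are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "version 1" > demo.txt
git add demo.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

echo "version 2" > demo.txt
git add demo.txt              # the index copy is now "version 2"
echo "version 3" > demo.txt   # the working tree copy moves on again

head_copy=$(git show HEAD:demo.txt)   # copy in the current commit
index_copy=$(git show :demo.txt)      # copy in the index (proposed next commit)
tree_copy=$(cat demo.txt)             # copy in the working tree

echo "HEAD:         $head_copy"
echo "index:        $index_copy"
echo "working tree: $tree_copy"
git status --short                    # two Ms: all three copies differ
```

    Here git show HEAD:demo.txt reads the committed copy and git show :demo.txt reads the index copy, so each of the three versions stays visible even though only one of them is an ordinary file on disk.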

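    With that in mind, here's what each of your commands does:

      • git rm --cached the-file removes the index copy of the file entirely and leaves the working tree copy alone. The proposed next commit then lacks the file.
      • git restore --staged the-file copies the HEAD version of the file into the index and leaves the working tree copy alone. If HEAD has no such file, the index copy is removed.
      • git reset the-file does the same thing as git restore --staged the-file.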

    (Note that git restore, unlike this particular form of git reset, can overwrite the working tree copy of some file, if you ask it to do so. The --staged option, without the --worktree option, directs it to write only to the index.)
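
    To see the split between the two groups, each command can be applied to the same staged change. This is only a sketch, assuming Git 2.23 or later (for git restore) and a hypothetical file demo.txt that already exists in HEAD:

```shell
# Sketch: apply each command to the same staged edit and compare the results.
# Assumes Git 2.23+; the repository and file name are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "original" > demo.txt
git add demo.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

stage_change() {              # put the same staged edit in place each time
    echo "edited" > demo.txt
    git add demo.txt
}

stage_change
git restore --staged demo.txt
after_restore=$(git status --short)   # index matches HEAD again

stage_change
git reset -q demo.txt
after_reset=$(git status --short)     # same result as git restore --staged

stage_change
git rm -q --cached demo.txt
after_rm=$(git status --short)        # file is gone from the index entirely

printf 'restore --staged:\n%s\n' "$after_restore"
printf 'reset:\n%s\n' "$after_reset"
printf 'rm --cached:\n%s\n' "$after_rm"
```

    The first two results should be the single line " M demo.txt", while git rm --cached should leave a staged deletion plus an untracked file of the same name.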

    Side note: many people initially think that the index / staging-area contains only changes, or only changed files. This is not the case, but if you were thinking of it this way, git rm --cached would appear to be the same as the other two. Since that's not how the index works, it's not.


    [1] There are some quirky edge cases when you stage something, then do a new git checkout. Essentially, if it's possible to keep a different staged copy in place, Git will do so. For the gory details see "Checkout another branch when there are uncommitted changes on the current branch".

    [2] The committed copy, and any staged copy, are actually kept in the form of an internal Git blob object, which de-duplicates contents. So if these two match, they literally just share one underlying copy. If the staged copy differs from the HEAD copy, but matches any (perhaps even many) other existing committed copy or copies, the staged copy shares the underlying storage with all those other commits. So calling each one a "copy" is overkill. But as a mental model, it works well enough: none can ever be overwritten; a new git add will make a new blob object if needed, and if nobody uses some blob object in the end, Git eventually discards it.
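
    The blob sharing described in footnote 2 can be observed by comparing object IDs. A rough sketch, assuming a throwaway repository with hypothetical files a.txt and b.txt:

```shell
# Sketch: equal contents share one blob object; changed contents get a new one.
# The repository and file names are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "same contents" > a.txt
git add a.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

head_blob=$(git rev-parse HEAD:a.txt)   # blob ID of the committed copy
index_blob=$(git rev-parse :a.txt)      # blob ID of the index copy

cp a.txt b.txt
git add b.txt
copy_blob=$(git rev-parse :b.txt)       # different path, equal contents

echo "changed" > a.txt
git add a.txt
new_blob=$(git rev-parse :a.txt)        # new contents, so a new blob

echo "HEAD a.txt:     $head_blob"
echo "index a.txt:    $index_blob"
echo "index b.txt:    $copy_blob"
echo "restaged a.txt: $new_blob"
```

    The first three IDs should be identical, since the object ID depends only on the contents, while the restaged a.txt should show a different ID.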


    A specific example

    In a comment, pavel_orekhov says:

    It is still not clear to me where "git rm --cached" and "git restore --staged" differ. Could you please show a series of commands with these 2 that exhibit different behavior?

    Let's check out a specific commit in the Git repository for Git itself (clone it first if needed, e.g., from https://github.com/git/git.git):

    $ git switch --detach v2.35.1
    HEAD is now at 4c53a8c20f Git 2.35.1
    

    Your working tree will contain files named Makefile, README.md, git.c, and so on.

    Let's now modify some existing file in the working tree:

    $ ed Makefile << end
    > 1a
    > foo
    > .
    > w
    > q
    > end
    107604
    107608
    $ git status --short
     M Makefile
    

    The > signs are from the shell asking for input; the two numbers are the byte counts of the file Makefile before and after the edit. Note the output from git status is space-M-space-Makefile, indicating that the index or staging area copy of Makefile matches the HEAD copy of Makefile, while the working tree copy of Makefile differs from the index copy of Makefile.

    (Aside: I accidentally added two foo lines while preparing the cut and paste text. I'm not going to go back and fix it, but if you do this experiment yourself, expect slightly different outputs.)

    Let's now git add this updated file, then replace foo in the first line with bar:

    $ git add Makefile
    $ git status --short
    M  Makefile
    

    Note that the M has moved left one column, M-space-space-Makefile, indicating that the index copy of Makefile differs from the HEAD copy, but now the index and working tree copies match. Now we do the foo-to-bar replacement:

    $ ed Makefile << end
    > 1s/foo/bar/
    > w
    > q
    > end
    107608
    107608
    $ git status --short
    MM Makefile
    

    We now have two Ms: the HEAD copy of Makefile differs from the index copy of Makefile, which differs from the working tree copy of Makefile. Running git diff --cached and git diff will show you exactly how each pairing compares.

    $ git diff --cached
    diff --git a/Makefile b/Makefile
    index 5580859afd..8b8fc5a6d6 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1,4 +1,5 @@
    -# The default target of this Makefile is...
    +foo
    +foo
     all::
     
     # Define V=1 to have a more verbose compile.
    $ git diff
    diff --git a/Makefile b/Makefile
    index 8b8fc5a6d6..96a787d50d 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1,4 +1,4 @@
    -foo
    +bar
     foo
     all::
     
    

    Now, if we run git rm --cached Makefile, this will remove the index copy of the file Makefile entirely, and git status will change accordingly. Because the index copy differs from both the HEAD copy and the working tree copy, Git demands the "force" flag as well:

    $ git rm --cached Makefile
    error: the following file has staged content different from both the
    file and the HEAD:
        Makefile
    (use -f to force removal)
    $ git rm --cached -f Makefile
    rm 'Makefile'
    $ git status --short
    D  Makefile
    ?? Makefile
    

    We now have no file named Makefile in our proposed next commit in the index / staging-area. However, the file Makefile still appears (with the first line reading bar) in the working tree (inspect the file yourself to see). This Makefile is now an untracked file, so we get two output lines from git status --short: one to announce the impending demise of file Makefile in the next commit, and the other to announce the existence of the untracked file Makefile.

    Without making any commit, we now use git restore --staged Makefile:

    $ git restore --staged Makefile
    $ git status --short
     M Makefile
    

    The status is now space-M again, indicating that Makefile exists in the index (and therefore will be in the next commit), and furthermore, matches the HEAD copy of Makefile, so git diff --staged—which is another way to spell git diff --cached—will not show it (and indeed will show nothing). The working tree copy remains undisturbed, and still contains the extra line bar, as git diff shows:

    $ git diff --staged
    $ git diff
    diff --git a/Makefile b/Makefile
    index 5580859afd..96a787d50d 100644
    --- a/Makefile
    +++ b/Makefile
    @@ -1,4 +1,5 @@
    -# The default target of this Makefile is...
    +bar
    +foo
     all::
     
     # Define V=1 to have a more verbose compile.
    

    Again, the key to understanding all of this is that the index holds a full copy of every file that will be in your proposed next commit, not just the changes.

    So if and only if removing the index copy entirely puts things back the way they were (which can happen when some file is new), then "make the index copy match the nonexistent HEAD copy, by removing it" is a correct way to do what you want. But if the HEAD commit contains a copy of the file in question, git rm --cached the-file is wrong.
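
    The "file is new" case, where the two commands do coincide, can be sketched as follows. This assumes Git 2.23 or later and hypothetical files old.txt (committed) and new.txt (never committed):

```shell
# Sketch: for a file that HEAD does not contain, both commands just untrack it.
# Assumes Git 2.23+; the repository and file names are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "tracked" > old.txt
git add old.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

echo "brand new" > new.txt

git add new.txt
git rm -q --cached new.txt
new_after_rm=$(git status --short)        # file is untracked again

git add new.txt
git restore --staged new.txt
new_after_restore=$(git status --short)   # same: no HEAD copy to restore

echo "rm --cached:      $new_after_rm"
echo "restore --staged: $new_after_restore"
```

    In both cases the index ends up without new.txt because HEAD has no new.txt either, and the working tree file is left in place as an untracked file.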


    [3] Note that --cached and --staged have the same meaning for git diff. For git rm, however, there's simply no --staged option at all. Why? That's a question for the Git developers, but we can note that historically, in the distant past, git diff did not have --staged either. My best guess is therefore that it was an oversight: when whoever added --staged to git diff did it, they forgot to add --staged to git rm too.