git

A commit in Git: Is it a snapshot/state/image or is it a change/diff/patch/delta?


When learning Git, this is very confusing.

So what is an appropriate mental model for a Git commit?


Solution

  • It depends!

    Short answer: both.

    Medium answer: It depends.

    Long answer: Git commits are a bit like particles vs. waves in quantum physics: neither of the two views alone can explain all observations. Read on.

    Internally, Git uses both representations, depending (conceptually) on which one it deems more efficient in terms of storage space and execution time for a given commit at a given time. The snapshot representation is the primary one; storing deltas is an optimization. However, for understanding what Git commands mean, this is irrelevant, because from the user's point of view the right model depends on what you do:

    Confusion 1: Commit as a snapshot vs. commit as a change

    Some commands only make sense when you think of commits as snapshots of the working tree. This is most pronounced for checkout, but it is also true for stash and at least halfway for fetch and reset.

    For other commands, madness is the likely result when you try to think of commits in this manner. For those commands, commits are clearly treated as changes; revert and rebase, both discussed below, are examples.

    For instance, the common confusion between revert and reset goes away once you understand that revert is about changes, whereas reset is about snapshots.
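    The difference can be sketched with a toy model in Python. This is not how Git is implemented; the names Commit, diff, apply, revert, and reset are made up for illustration, files are plain strings, and conflicts are ignored. It only shows that a "change" can be derived from two snapshots, that a revert-like operation adds a new commit undoing one change, and that a reset-like operation merely points the branch at an existing snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical toy commit: a full snapshot (path -> content) plus a parent link."""
    snapshot: dict
    parent: Optional["Commit"] = None


def diff(old, new):
    """Change view: path -> (old content, new content); None means the file is absent."""
    paths = set(old) | set(new)
    return {p: (old.get(p), new.get(p)) for p in paths if old.get(p) != new.get(p)}


def apply(snapshot, change):
    """Apply a change to a snapshot and return the new snapshot (no conflict checks)."""
    result = dict(snapshot)
    for path, (_, new_content) in change.items():
        if new_content is None:
            result.pop(path, None)
        else:
            result[path] = new_content
    return result


def revert(head, bad):
    """Revert-like: create a NEW commit on top of head that undoes bad's change."""
    change = diff(bad.parent.snapshot, bad.snapshot)
    inverse = {p: (new, old) for p, (old, new) in change.items()}
    return Commit(apply(head.snapshot, inverse), parent=head)


def reset(target):
    """Reset-like (think of a hard reset): the branch simply points at an existing snapshot."""
    return target


c1 = Commit({"a.txt": "1"})
c2 = Commit({"a.txt": "2"}, parent=c1)                 # changes a.txt
c3 = Commit({"a.txt": "2", "b.txt": "x"}, parent=c2)   # adds b.txt

reverted = revert(c3, c2)   # undoes only c2's change; b.txt survives
rolled_back = reset(c1)     # back to c1's snapshot; b.txt is gone as well
```

    In this sketch, reverted is a fourth commit whose history still contains c2 and c3, while rolled_back is just c1 again.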

    Confusion 2: Commit as a fixed thing vs. commit as something fluid

    There is a side effect of the above that can shock Git newbies accustomed to other version control systems: Git appears not even to commit itself to its commits.

    Huh?

    Assume you have created a branch X containing what you like to think of as your commits A and B. But main has progressed a little, so you rebase X onto main.

    When you think of A and B as changes, but of main as a snapshot (hey, both commit models occur in a single operation!), this is not a problem: Just apply the changes A and B to the snapshot main.

    This thinking is so natural that you will barely notice that Git has now rewritten your commits A and B: the new versions have different snapshot content and a different parent, and hence different SHA-1 IDs. In Git, the conceptual commit that you think of as a developer is not a fixed-for-all-times kind of thing, but rather a fluid object that changes as a result of working with your repository.

    In contrast, if you think of all three (A, B, and main) as snapshots or of all three as changes, your brain will hurt and you will get nowhere.
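    The rebase scenario can be sketched as a toy model in Python. Everything here is an assumption made for illustration: commit_id, make_commit, and rebase are hypothetical helpers, the ID is a SHA-1 over only the snapshot and the parent ID (real Git also hashes the author, dates, and the message), and conflicts are ignored. The point is only that replaying the changes A and B onto the snapshot main yields commits with the same changes but new snapshots and new IDs.

```python
import hashlib
import json


def commit_id(snapshot, parent_id):
    """Toy ID: SHA-1 over snapshot content and parent ID (a simplification of real Git)."""
    payload = json.dumps({"snapshot": snapshot, "parent": parent_id}, sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()


def make_commit(store, snapshot, parent_id):
    """Store a commit under its content-derived ID and return that ID."""
    cid = commit_id(snapshot, parent_id)
    store[cid] = {"snapshot": snapshot, "parent": parent_id}
    return cid


def diff(old, new):
    """Change view: path -> (old content, new content); None means the file is absent."""
    paths = set(old) | set(new)
    return {p: (old.get(p), new.get(p)) for p in paths if old.get(p) != new.get(p)}


def apply(snapshot, change):
    """Apply a change to a snapshot and return the new snapshot (no conflict checks)."""
    result = dict(snapshot)
    for path, (_, new_content) in change.items():
        if new_content is None:
            result.pop(path, None)
        else:
            result[path] = new_content
    return result


def rebase(store, commit_ids, new_base):
    """Replay each commit's change (relative to its original parent) onto new_base."""
    tip = new_base
    new_ids = []
    for cid in commit_ids:
        commit = store[cid]
        change = diff(store[commit["parent"]]["snapshot"], commit["snapshot"])
        tip = make_commit(store, apply(store[tip]["snapshot"], change), tip)
        new_ids.append(tip)
    return new_ids


store = {}
root = make_commit(store, {"f": "0"}, None)
a = make_commit(store, {"f": "0", "a": "A"}, root)           # commit A on branch X
b = make_commit(store, {"f": "0", "a": "A", "b": "B"}, a)    # commit B on branch X
main = make_commit(store, {"f": "1"}, root)                  # main has progressed

a2, b2 = rebase(store, [a, b], main)   # the rewritten A and B
```

    In this sketch, a2 and b2 carry the same changes as a and b, but because their snapshots and parents differ, their IDs differ too. The original a and b are still in the store; the branch would simply no longer point at them.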

    Disclaimer

    The above is a much-simplified description. In Git reality,

    And don't get confused by the fact that the Pro Git book's very first characterization of Git (in section "Git Basics") is "Snapshots, Not Differences".

    Git is complicated after all.