
libgit2-newbie question: What goes in the file tree in a commit?


tl;dr: libgit2 newbie gets confused about the index when creating commits in code, and eventually decides to ignore the index and just update the commit tree. Please advise if this is wrong.

This is a (libgit2-newbie) question about git internals, using libgit2. I'm trying to write some code to stage a file and commit it so I can push it. For simplicity, assume that I have just cloned the repository, and there are several files in the repository. So the index on disk is empty. Then I read the index into my program, add my new file to the index and commit the change using the index (with the previous commit as parent). This means that there is only the one file in the tree that I pass to commit. However, this results in all my other files getting deleted.

What am I doing wrong?

Should the tree that I pass to commit contain all the files (including existing files) or just the new file? If all the files, where do I get the list from? I have looked at the examples in https://libgit2.org/docs/guides/101-samples/#commits but that does not answer my question.

My code is in Rust, so I will spare you the details unless someone wants it.

Edit: OK, I have looked a bit further at the documentation. Is the index a red herring that I should ignore for this exercise? Is the solution to use git_treebuilder_new() (or rather repo.treebuilder(Some(old_tree)), for me), add my new file using the tree builder, build the new tree, and commit using the new tree? That seems to be working.


Solution

  • The index works like a slate on which you build a "filesystem image": tree objects hold the path information and blob objects hold the file data, and each object is serialized to a file addressed by its hash. At least initially, every object is stored as a single file under .git/objects/. After a while, the object database (ODB) gets packed (usually via git-gc), so some objects end up stored together, and it is only at that point that actual deltas get involved, to limit storage size to roughly the latest version of a file plus binary deltas for its "ancestry".

    A commit therefore always points to a complete snapshot, not to a diff against its parent. Hence, you have to either start from an existing tree (usually by looking at HEAD, which keeps track of what your working copy corresponds to, and "peeling" it to a tree), or start from an empty index, in which case the new tree contains only the data you have added, and all other files from the previous commit will have disappeared (a.k.a. lots of "deletes").
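To make the snapshot point concrete, here is a minimal toy model in plain Rust (standard library only). It is not libgit2 or git2 code: `Tree`, `build_tree` and `deleted_paths` are hypothetical names invented for this sketch, and a flat map of path to blob id is only a stand-in for real tree objects. It compares a tree built from nothing with a tree built on top of the parent's tree.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a git tree: a complete snapshot mapping path -> blob id.
/// (Hypothetical type for illustration; real trees are nested objects.)
type Tree = BTreeMap<String, String>;

/// Paths present in `parent` but missing from `child`. A commit whose tree
/// is `child` would show these as deletions relative to `parent`.
fn deleted_paths(parent: &Tree, child: &Tree) -> Vec<String> {
    parent
        .keys()
        .filter(|p| !child.contains_key(*p))
        .cloned()
        .collect()
}

/// Start from an optional base tree (or from nothing) and add one entry.
fn build_tree(base: Option<&Tree>, path: &str, blob: &str) -> Tree {
    let mut tree = base.cloned().unwrap_or_default();
    tree.insert(path.to_string(), blob.to_string());
    tree
}

fn main() {
    // The tree of the parent commit (what HEAD peels to).
    let mut head = Tree::new();
    head.insert("README.md".into(), "blob-a".into());
    head.insert("src/main.rs".into(), "blob-b".into());

    // Starting from nothing: the snapshot holds only the new file.
    let from_empty = build_tree(None, "new.txt", "blob-c");
    // Starting from the parent's tree: the existing files are carried over.
    let from_head = build_tree(Some(&head), "new.txt", "blob-c");

    println!("from empty: deleted {:?}", deleted_paths(&head, &from_empty));
    println!("from HEAD:  deleted {:?}", deleted_paths(&head, &from_head));
}
```

In this model the first line lists both existing files as deleted and the second lists none. Under that reading, `repo.treebuilder(Some(old_tree))` from the question corresponds to the `Some(&head)` case, and so would reading the HEAD tree into the index before adding the new file.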