
Get all notes for a commit including their author and committer


Is there an easy way to retrieve all notes for a certain commit including information about the note author and the note committer?

Using git show --notes=refs/notes/* <commit_hash>, I was able to get all notes. However, I have not found a way to get the author and committer of each note using plumbing commands.


Solution

  • While git notes does work by making commits, the author and committer names attached to a notes commit are not considered useful, so Git does not show them. There are some good reasons for this.

    Let's take a quick look at a repository that has notes. Here, I am going to use the freebsd repository on GitHub, as it has refs/notes/commits (which for $reasons I have mapped to refs/notes/origin/commits). At the moment, we have:

    $ git rev-parse refs/notes/origin/commits
    af51c6d65d574faa11ab8026398e045e5f584040
    $ git show af51c6d65d574faa11ab8026398e045e5f584040 | sed 's/@/ /'
    commit af51c6d65d574faa11ab8026398e045e5f584040
    Author: hselasky <hselasky FreeBSD.org>
    Date:   Mon Aug 14 12:59:14 2017 +0000
    
        Adding Git note for current refs/heads/stable/10
    
    diff --git a/13/06/ece6bde0e4291eaf08139085990e5f55a622 b/13/06/ece6bde0e4291eaf08139085990e5f55a622
    new file mode 100644
    index 000000000000..3d1ba58d02fa
    --- /dev/null
    +++ b/13/06/ece6bde0e4291eaf08139085990e5f55a622
     @ -0,0 +1 @@
    +svn path=/stable/10/; revision=322500
    

    Look at what is going on in this commit: it adds a file named 13/06/ece6bde0e4291eaf08139085990e5f55a622. This file contains the note for the object whose ID is 1306ece6bde0e4291eaf08139085990e5f55a622:

    $ git show --decorate 1306ece6bde0e4291eaf08139085990e5f55a622 | sed 's/@/ /'
    commit 1306ece6bde0e4291eaf08139085990e5f55a622 (origin/stable/10)
    Author: hselasky <hselasky FreeBSD.org>
    Date:   Mon Aug 14 12:59:14 2017 +0000
    
        MFC r314878:
        Add support for constant pointer constructs to READ_ONCE() in the
        LinuxKPI. When the type of the argument is constant the temporary
        variable cannot be assigned after the barrier. Instead assign the
        temporary variable by initialization.
    
        Approved by:            re (kib)
        Sponsored by:           Mellanox Technologies
    
    Notes (origin/commits):
        svn path=/stable/10/; revision=322500
    
    [diff snipped]
    

    (I have this repository configured to use refs/notes/origin/commits, so the note gets shown here.) In this case, which is typical, the author and committer of the note itself are the same as the author and committer of the commit to which the note attaches.

    If we look closer at object af51c6d65d574faa11ab8026398e045e5f584040, though, we see that it has many files with these odd names:

    $ git ls-tree af51c6d65d574faa11ab8026398e045e5f584040
    040000 tree 598b9e08b0138536da55f5ef55868b2a3a607194    00
    040000 tree 5101bb91ab93102057e242b41e19c55fdf3314e7    01
    040000 tree 2105ada31d9191d03b50a7ad5c97471c0b531283    02
    040000 tree 3e42537c937f0f36cff65b7b19570d2e301a17d2    03
    [and so on for 256 names]
    

    If we look at 598b9e08b0138536da55f5ef55868b2a3a607194, we find another 253 sub-trees named 00, 01, 02, .... The top one is:

    $ git ls-tree 598b9e08b0138536da55f5ef55868b2a3a607194 | head -1
    040000 tree 825029e67b3e99c8c9f36c68c26c57b7f4c2edb4    00
    

    and 825029e67b3e99c8c9f36c68c26c57b7f4c2edb4 itself has only 7 entries:

    $ git ls-tree 825029e67b3e99c8c9f36c68c26c57b7f4c2edb4
    100644 blob 47efcb375199433eaff1932ab03ff51ffbc0f4b2    067319a197c517553b4bd00eeca22fbbb7bb
    100644 blob a6aa47f5b27bc95afedb4178da68958d10c0665a    252549bd16445f7a9c45ff41b295a8bc653d
    100644 blob b892978bc9120fe06997841b48e6cc05027234db    a5b6354a4d39247d26563c2cf96ea644af63
    100644 blob b50a07b6e41e178bc374ab25e860a33560929801    b2ecb7b200786a0a17d90036510a2aa4fa86
    100644 blob 5725ea47215dd0697cb4510c53b63c49f6285a1b    d86e693087beab26f4d49d7e3eb86a611efb
    100644 blob 86754af5a4c9077db1e0bc52952823b3012278b4    e846eae3e7cb94a93496931c64d69682818f
    100644 blob 8798a0048bf97bce373e24f0fb0456544c127406    fe19d1123431f6cbad809f708ab744b4d02c
    

    The names on the right are file names: these are the files stored in the 00 sub-tree of the 00 tree of the commit. So this "top level" note commit, made by hselasky, contains hundreds of thousands of files (332,670 at the moment), all of whose names are these funky hash IDs, split up into directories of directories so that no one sub-directory has too many files.

    What git show does with a real commit, such as 1306ece6bde0e4291eaf08139085990e5f55a622 (the current tip of refs/heads/stable/10), is to look in commit af51c6d65d574faa11ab8026398e045e5f584040 to see if it has a file named 1306ece6bde0e4291eaf08139085990e5f55a622, or a directory named 13. If it finds a directory, it checks within it to see if there is a file named 06ece6bde0e4291eaf08139085990e5f55a622 or a directory named 06. If it finds a directory, it peels off the next two characters, and so on. Eventually, it either finds the file, or it doesn't.

    If Git does find a file whose name matches the commit's hash, then that file contains the notes for that commit.

    If Git does not find such a file, then there is no note for that commit.
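
    The lookup described above can be sketched in shell. This is only an illustration of the idea, not how Git implements it (Git does this internally). The function names note_candidate_paths and find_note_blob, and the cut-off of four fan-out levels, are assumptions made for this example; the hash passed in must be the full object ID.

```shell
# Print every path under which the note for object $1 could be stored:
# first with no fan-out (the whole hash as one file name), then with
# 1, 2, ... up to $2 two-character directory levels (default 4; the
# cut-off is an arbitrary assumption for this sketch).
note_candidate_paths() (
    rest=$1
    max=${2:-4}
    prefix=
    level=0
    while :; do
        printf '%s%s\n' "$prefix" "$rest"
        [ "$level" -ge "$max" ] && break
        head=${rest%"${rest#??}"}   # first two characters of what is left
        rest=${rest#??}             # everything after those two characters
        prefix=$prefix$head/
        level=$((level + 1))
    done
)

# Try each candidate path inside the notes ref $1 for object $2 and
# print "<path> <blob-id>" for the first one that exists.
find_note_blob() (
    ref=$1
    for p in $(note_candidate_paths "$2"); do
        if blob=$(git rev-parse --verify --quiet "$ref:$p"); then
            printf '%s %s\n' "$p" "$blob"
            exit 0
        fi
    done
    exit 1   # no note for this object in this ref
)
```

    In the repository shown here, find_note_blob refs/notes/origin/commits 1306ece6bde0e4291eaf08139085990e5f55a622 would be expected to report the 13/06/ece6... path seen in the diff earlier. For everyday use, git notes --ref=<ref> list <object> prints the note's blob ID without any of this.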

    Now, it is possible to update a note. Updating a note just means that we make a new commit whose contents are the same as the previous notes-commit except for one file. Let's say that we decide to update the note for commit 1306ece6bde0e4291eaf08139085990e5f55a622. We find that it's been put in 13/06/ec..., so we extract notes-commit af51c6d6... into a tree, edit the file 13/06/ec..., change the note, write the file, and write a new commit whose parent commit is af51c6d6.... We stuff the new hash for this new commit into refs/notes/origin/commits or refs/notes/commits ... and now we have replaced the note.

    Replacing the note like this is one way we could get a different author and committer for the note that goes with the original commit 1306ec....
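
    To see this effect without touching a real repository, here is a small sketch that builds a throwaway repository, adds a note as one person, and overwrites it as another. The names Alice and Bob, the example.invalid addresses, and the function name demo_note_update are all made up for the illustration; it uses git notes add -f rather than the manual tree-editing steps described above, which should produce the same kind of result (a new notes commit on top of the old one).

```shell
# Build a scratch repository, attach a note to a commit as "Alice",
# then overwrite that note as "Bob", and list the authors of the
# commits on the notes ref (newest first).
demo_note_update() (
    set -e
    dir=$(mktemp -d)
    trap 'rm -rf "$dir"' EXIT
    cd "$dir"
    git init -q

    # Hypothetical identities, set through the environment so the
    # sketch does not depend on any existing Git configuration.
    export GIT_AUTHOR_NAME=Alice GIT_AUTHOR_EMAIL=alice@example.invalid
    export GIT_COMMITTER_NAME=Alice GIT_COMMITTER_EMAIL=alice@example.invalid
    git commit -q --allow-empty -m 'some commit'
    git notes add -m 'first version of the note' HEAD

    # A different person replaces the note: this makes a second
    # commit on refs/notes/commits whose parent is the first one.
    export GIT_AUTHOR_NAME=Bob GIT_COMMITTER_NAME=Bob
    git notes add -f -m 'second version of the note' HEAD 2>/dev/null

    git --no-pager log --format='%an' refs/notes/commits
)
```

    Calling demo_note_update should list Bob first and Alice second: the commit that the note is attached to is unchanged, but the tip of the notes ref now carries Bob's name.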

    But let's take a look at an earlier commit. The next commit down on stable/10 is 8f2e6e2e028ef61fd105967432ff2838153110f7. We find its note by looking at the commit to which refs/notes/origin/commits points: that's af51c6... again. Does it have a directory named 8f? Why yes, it does. Does that directory have a sub-directory named 2e? Sure enough, it does. Does that have a file named 6e2e...?

    $ git rev-parse refs/notes/origin/commits:8f/2e/6e2e028ef61fd105967432ff2838153110f7
    46fa8873ffcf4c9e0d0270b02a3e2abcdf10e31e
    

    Indeed it does, and that's a blob that we can view:

    $ git cat-file -p 46fa8873ffcf4c9e0d0270b02a3e2abcdf10e31e
    svn path=/stable/10/; revision=322462
    

    so that's the note for commit 8f2e6e2e.... But the author and committer of the place we looked (refs/notes/origin/commits, aka af51c6...) is hselasky, while the author and committer of 8f2e6e2e028ef61fd105967432ff2838153110f7 is avos.

    To find out who most recently changed the note for 8f2e6e2e028ef61fd105967432ff2838153110f7 (that is, the author and committer of the notes commit that last modified it), we must start at the top level refs/notes/origin/commits commit and walk back through its history to see who has touched a file whose name looks like that. The name itself is actually, currently, 8f/2e/6e2e...; but at some earlier point, the name would have been 8f/2e6e2e..., and as the repository continues to grow, the name will at some point suddenly change to 8f/2e/6e/2e.... So we would need a tool that is not only aware of the funky hash-ID-as-file-name thing, but also knows that the split, into directories containing sub-directories containing files, evolves over time. This makes it pretty hard to find when the note changed, if it ever did.

    If the note never changes, the first time it gets added to the repository is usually "when the commit itself is created" and therefore its author and committer will match the commit's author and committer. So that's usually good enough. If it's not good enough for your use case, you will have to write your own tools.
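
    As a starting point for such a tool, here is a rough sketch that asks git log for every notes commit that touched any of the file names the note could have had. The function name note_history and the limit of four fan-out levels are assumptions for this example, and the approach is only an approximation of what a proper tool would do.

```shell
# List the commits on notes ref $1 that touched the note file for
# object $2, under any fan-out layout from 0 to 4 directory levels.
# Each output line shows the notes commit, its author, its committer
# and the commit date.
note_history() (
    ref=$1
    rest=$(git rev-parse --verify --quiet "$2") || exit 1

    # Collect the candidate paths as positional parameters:
    # <hash>, xx/<rest>, xx/yy/<rest>, ...
    set -- "$rest"
    prefix=
    i=0
    while [ "$i" -lt 4 ]; do
        prefix=$prefix${rest%"${rest#??}"}/   # append the next two characters
        rest=${rest#??}
        set -- "$@" "$prefix$rest"
        i=$((i + 1))
    done

    git --no-pager log \
        --format='%h  author %an <%ae>  committer %cn <%ce>  %cd' \
        "$ref" -- "$@"
)
```

    For example, note_history refs/notes/origin/commits 8f2e6e2e028ef61fd105967432ff2838153110f7 would list the candidate notes commits, newest first. One caveat: a notes commit that merely moved the file from one fan-out layout to another also shows up here, even though the note text did not change, so the blob IDs still need to be compared before concluding that the note itself was edited.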