Tags: git, github, gitlab, git-tag, visual-sourcesafe-2005

How to change the Tagger name and email of a Git Tag


Long story short: I'm writing a script to migrate a very large project from (gasp) Microsoft SourceSafe to Git, and I'm trying to retain the authors of the SourceSafe project's labels (which are essentially tags in Git). I know you can modify the author and committer name/date of a Git commit, but can you do the same to a Git tag?


Solution

  • TL;DR

    Re-create the tags with the new desired data. But if anyone else had them before, they may not accept your new ones. Or they may! It's up to them, though.
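
    For a migration script that creates the tags in the first place, a minimal sketch might look like the following. It relies on the fact that git tag -a takes the tagger from the committer identity, so the GIT_COMMITTER_* environment variables control it. The repository, the tag name release-1.0, and the name, email and date are all made-up placeholders standing in for whatever the SourceSafe label records.

    ```shell
    # Throwaway repository so the example is self-contained.
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Migration Script"
    git config user.email "migration@example.com"
    git commit -q --allow-empty -m "Initial import"

    # Hypothetical label author and date recovered from SourceSafe.
    # The date uses Git's internal "<unix seconds> <utc offset>" form
    # (1117627200 is 2005-06-01 12:00 UTC).
    GIT_COMMITTER_NAME="Jane Doe" \
    GIT_COMMITTER_EMAIL="jane.doe@example.com" \
    GIT_COMMITTER_DATE="1117627200 +0000" \
    git tag -a -m "Label: Release 1.0" release-1.0 HEAD

    # Show the tagger that ended up in the new tag object.
    tagger=$(git for-each-ref --format='%(taggername) %(taggeremail)' refs/tags/release-1.0)
    echo "$tagger"
    ```

    The variables are set only for that one git tag invocation, so the commits the script creates keep their own identity.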

    Description

    "I know you can modify the author and committer name/date of a Git Commit"

    Actually, you can't, and the fact that you can't (and what you can do instead) plays an important part in the rest of the answer.

    All Git objects have a hash ID as their "true name". The hash is formed by computing a cryptographic checksum of the object's contents. This means you can never change any Git object at all.[1] What you can do is construct a new object, then convince everyone who had the old object to stop using it and use the new object instead.
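
    To see that the ID really is derived from the contents, here is a small sketch using git hash-object on two made-up strings that differ by one character. Nothing is written to any repository.

    ```shell
    # Hash two blobs that differ by a single character.
    id_a=$(printf 'hello\n' | git hash-object --stdin)
    id_b=$(printf 'hellp\n' | git hash-object --stdin)
    echo "$id_a"
    echo "$id_b"

    # Hashing the same bytes again reproduces the first ID exactly.
    id_a_again=$(printf 'hello\n' | git hash-object --stdin)
    echo "$id_a_again"
    ```

    Any change to the bytes, however small, gives a different ID, which is why "editing" an object always means making a new one.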

    This is what git commit --amend does (and what various interactive rebase options like edit and reword can do as well). First we extract the original Git object into ordinary data, where we can manipulate it; then we do the manipulation and ask Git to construct a new object; and finally we stop using the old object and start using the new one instead.

    For a commit that is the tip commit (see the definition of head in the gitglossary), this all goes pretty easily and smoothly, as long as we haven't pushed that commit yet. There are no additional commits referring back to this tip commit, so we make a new commit that is "just as good", re-direct the branch name (the head) to the new commit, and forget about the original we just replaced. It looks like we changed a commit, but we got a new hash ID instead.
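
    The same thing can be watched in a throwaway repository. This is only a sketch; the file name and both identities are placeholders.

    ```shell
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Original Author"
    git config user.email "original@example.com"
    echo "hello" > file.txt
    git add file.txt
    git commit -q -m "Initial import"
    old_id=$(git rev-parse HEAD)

    # "Change" the author: Git builds a replacement commit and moves the branch to it.
    git commit -q --amend --no-edit --author="Jane Doe <jane.doe@example.com>"
    new_id=$(git rev-parse HEAD)
    echo "old: $old_id"
    echo "new: $new_id"

    # The original commit object still exists, unchanged, under its old ID.
    old_author=$(git log -1 --format='%an' "$old_id")
    new_author=$(git log -1 --format='%an' "$new_id")
    echo "old author: $old_author, new author: $new_author"
    ```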

    How this applies to tags

    Git has two kinds of tags: lightweight tags and annotated tags. The difference between these is that an annotated tag consists of a lightweight tag pointing to a tag object. It's the tag object that has the tagger information. (A lightweight tag has no such information of its own; it just points directly to the commit object.)

    Hence, to "change" a tag object, we must do the same thing we do to "change" a commit object: copy it to a new tag object.
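
    The difference between the two kinds of tag is easy to check with git cat-file -t. The sketch below uses a throwaway repository and made-up tag names.

    ```shell
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Migration Script"
    git config user.email "migration@example.com"
    git commit -q --allow-empty -m "Initial import"

    git tag light-1.0                           # lightweight: the ref points at the commit
    git tag -a -m "Release 1.0" annotated-1.0   # annotated: the ref points at a tag object

    light_type=$(git cat-file -t light-1.0)
    annotated_type=$(git cat-file -t annotated-1.0)
    echo "light-1.0 -> $light_type"
    echo "annotated-1.0 -> $annotated_type"
    ```

    Only the annotated tag has an object of type tag behind it, so only it has a tagger line to replace.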

    There is no built-in command to do this, but it is easy to build one out of git cat-file -p, which lets you extract the original tag into ordinary data, and git mktag, which lets you turn ordinary data into a new tag object. For instance, the v2.2.1 tag in the Git repository for Git begins with:

    $ git cat-file -p v2.2.1
    object 9b7cbb315923e61bb0c4297c701089f30e116750
    type commit
    tag v2.2.1
    tagger Junio C Hamano <...
    

    The object line is the commit to which the tag points:

    $ git cat-file -t 9b7cbb315923e61bb0c4297c701089f30e116750
    commit
    

    so we can copy this tag to a new one with a different tagger:

    $ new_hash_id=$(git cat-file -p v2.2.1 | sed -e .... | git mktag)
    $ git update-ref "refs/tags/$name" "$new_hash_id"
    

    where the sed does whatever is necessary (see below) and $name is the name of the tag (here, v2.2.1). The git update-ref step is what makes the lightweight tag v2.2.1 point to the new tag object in $new_hash_id. But there are two problems (only one of which is likely to apply to your case).
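
    Before getting to those problems, here is one way the whole copy could look end to end for an unsigned tag. Treat it as a sketch under assumptions: the repository, the tag v1.0, and both identities are invented, and the sed expression is just one possible choice that rewrites the name and email on the tagger header line while leaving the timestamp alone. I use git cat-file tag here, which prints the raw object; git cat-file -p should give the same text on current Git versions.

    ```shell
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Migration Script"
    git config user.email "migration@example.com"
    git commit -q --allow-empty -m "Initial import"
    git tag -a -m "Release 1.0" v1.0

    name=v1.0
    old_id=$(git rev-parse "refs/tags/$name")

    # Copy the tag object, replacing the name and email on the tagger line.
    # The 1,/^$/ address limits the substitution to the header, so a message
    # line that happens to start with "tagger " is left alone.
    new_hash_id=$(git cat-file tag "$name" |
        sed -e '1,/^$/ s/^tagger [^<]*<[^>]*>/tagger Jane Doe <jane.doe@example.com>/' |
        git mktag)

    # Point the tag name at the new tag object.
    git update-ref "refs/tags/$name" "$new_hash_id"

    tagger=$(git for-each-ref --format='%(taggername) %(taggeremail)' "refs/tags/$name")
    echo "$tagger"
    ```

    The tag still points to the same commit; only the tag object, and therefore its hash ID, is new.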

    Tags may be PGP-signed

    The above tag goes on to say:

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1
    

    and then has a PGP signature in it. This signature covers all the data except for the signature itself. If you copy-and-modify this tag, you should discard the original signature entirely (it will be invalid and will fail any testing applied); whether you can and should replace it with a new signature, and if so, whose, is up to you.
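
    Discarding the signature can be done in the same sed step. The sketch below runs on synthetic tag text (the object ID, tag name and tagger are placeholders, not a real signed tag) and deletes everything from the BEGIN PGP SIGNATURE line to the end of the input.

    ```shell
    # Synthetic tag text; every value here is a placeholder.
    signed_tag=$(printf '%s\n' \
        'object 0000000000000000000000000000000000000000' \
        'type commit' \
        'tag v0.0-example' \
        'tagger Example Tagger <tagger@example.com> 1117627200 +0000' \
        '' \
        'Example release' \
        '-----BEGIN PGP SIGNATURE-----' \
        '(placeholder signature data)' \
        '-----END PGP SIGNATURE-----')

    # Drop the signature block: from the BEGIN line through the last line.
    unsigned_tag=$(printf '%s\n' "$signed_tag" |
        sed -e '/^-----BEGIN PGP SIGNATURE-----$/,$d')
    printf '%s\n' "$unsigned_tag"
    ```

    In a real pipeline this expression would sit alongside the one that rewrites the tagger line, before the data is handed to git mktag.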

    Tags are not supposed to change their target objects

    The existing lightweight tag v2.2.1 currently points to the existing tag object:

    $ git rev-parse v2.2.1
    7c56b20857837de401f79db236651a1bd886fbbb
    

    This is the data we have been viewing up to this point.
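
    The two hash IDs involved (the tag object's and the commit's) can be told apart with git rev-parse. A small sketch in a throwaway repository, with a made-up tag:

    ```shell
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git config user.name "Migration Script"
    git config user.email "migration@example.com"
    git commit -q --allow-empty -m "Initial import"
    git tag -a -m "Release 1.0" v1.0

    tag_object_id=$(git rev-parse v1.0)          # the tag object the name points to
    commit_id=$(git rev-parse 'v1.0^{commit}')   # the commit that tag object points to
    echo "tag object: $tag_object_id"
    echo "commit:     $commit_id"
    ```

    Re-creating the tag changes the first ID but not the second.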

    The new tag object will have some other, different hash ID. When we amended an unpublished commit, that was no big deal, because no one else had any idea that some branch name mapped to some particular hash ID.

    Tags, however, are pretty commonly "well known". In fact, the point of tags—particularly PGP-signed annotated tags, where the PGP signature lets you verify that no one has monkeyed with the tag data—is to guarantee that you can be sure that this tag is the right tag, and that the commit object to which it points is the original commit and not some Trojan Horse. If you change an existing tag, you're subverting this intent. Moreover, some people who know the previous tag's value will simply refuse to take a new value: you won't be able to get them to update an existing tag. As long as you're doing this before anyone else has the tag, though, they will never know, and you will be fine.
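
    What "refuse to take a new value" looks like in practice depends on the other person's Git. As far as I know, since Git 2.20 a plain git fetch will not overwrite an existing tag unless it is forced, while older versions behaved differently, so the sketch below only reports what the plain fetch did. The repositories upstream and downstream and the tag v1.0 are made up.

    ```shell
    work=$(mktemp -d)
    cd "$work"
    git init -q upstream
    git -C upstream config user.name "Migration Script"
    git -C upstream config user.email "migration@example.com"
    git -C upstream commit -q --allow-empty -m "Initial import"
    git -C upstream tag -a -m "Release 1.0" v1.0

    # Someone else clones while the original tag is in place.
    git clone -q upstream downstream
    before=$(git -C downstream rev-parse v1.0)

    # Upstream then replaces the tag with a new tag object.
    git -C upstream tag -f -a -m "Release 1.0 (retagged)" v1.0 >/dev/null

    # A plain fetch may refuse to replace the tag the clone already has.
    if git -C downstream fetch -q --tags origin 2>/dev/null; then
        plain_fetch=accepted
    else
        plain_fetch=rejected
    fi
    echo "plain fetch: $plain_fetch"

    # The other person has to opt in explicitly to take the new tag.
    git -C downstream fetch -q --tags --force origin
    after_force=$(git -C downstream rev-parse v1.0)
    echo "before: $before"
    echo "after:  $after_force"
    ```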


    [1] Or rather, you cannot change a Git object's contents unless you can break the hash. See also "How does the newly found sha1 collision affect git?"