mercurial cvs cvs2svn

Is cvs2hg still potentially producing corrupted repositories?


Trying to migrate a repository from CVS to hg, I found the tool cvs2hg, and it seems to do the job nicely (the conversion goes fine, and I get all the tags and branches). However, the hg documentation warns that "fixup commits" can leave the repository somewhat corrupted, or at least dangerous.

Is this still a problem? Maybe hg or cvs2hg has received fixes since this warning was written. If it is still a potential problem, how can I check whether the resulting hg repository is in such a dangerous state?


Solution

  • Fixup commits are good and necessary, and cvs2hg does a much better job than hg convert.

    But first, about the problem. In a CVS repository you can play various dirty tricks with tags and branches. For example, you can manually fine-tune a tag so that it marks today's version of 3 files, yesterday's version of 4 others, and a month-old version of yet another. In practice, I did this many times to make "patch tags": there is some old tag, I make various commits afterwards, a bug turns up, I fix the bug, and then I make a fixup tag from the old tag by moving it on 1-2 files.

    As a result, you get a tag that points to a release which never existed, and never will exist, at any point in the repository's history, if the history is taken for the whole repo.

    Similar tricks can be played with branches, or a branch can start from such an "ugly" tag.

    Any kind of "natural" conversion from CVS to hg is hopelessly lost in such cases. There is no point in the time-based history at which such a tag or branch could be attached. hg convert just binds such tags at more-or-less random places, and branches at very ugly places.

    Fixup commits are simply those missing revisions: artificial commits which are attached at an appropriate place and introduce the changes needed to put the repository in the state it should have at a given tag. With them, both "artificial" tags and branches get properly bound to the proper code.
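
    The idea can be illustrated with a toy model. This is only a sketch and not how cvs2hg is implemented: the commit order in HISTORY is invented, the helper names are hypothetical, and picking the "closest" snapshot as the parent is a naive assumption made for the illustration. The file names and the tag blah_1.0 are borrowed from the example in this answer.

```python
# Toy model: each commit maps file name -> new CVS revision (hypothetical history).
HISTORY = [
    {"a.c": "1.1", "b.c": "1.1", "c.c": "1.1"},  # initial import
    {"a.c": "1.2"},                              # later edit of a.c
    {"c.c": "1.2"},                              # later edit of c.c
]

# A hand-tuned CVS tag that mixes revisions from different points in time.
TAG_BLAH_1_0 = {"a.c": "1.1", "b.c": "1.1", "c.c": "1.2"}


def snapshots(history):
    """Return the whole-repository state after each commit."""
    state = {}
    result = []
    for changes in history:
        state = {**state, **changes}
        result.append(dict(state))
    return result


def distance(base, tag):
    """Count the files whose revision differs between two states."""
    files = set(base) | set(tag)
    return sum(1 for f in files if base.get(f) != tag.get(f))


def closest_snapshot(tag, history):
    """Naively pick the first snapshot that differs from the tag in the fewest files."""
    states = snapshots(history)
    index = min(range(len(states)), key=lambda i: distance(states[i], tag))
    return index, states[index]


def fixup_delta(base, tag):
    """Changes an artificial commit on top of `base` must make to reach `tag`.

    A value of None means the file has to be removed.
    """
    files = set(base) | set(tag)
    return {f: tag.get(f) for f in files if base.get(f) != tag.get(f)}
```

    In this model the tagged state matches none of the snapshots, so the tag cannot be attached to any existing changeset; the delta returned by fixup_delta is what an artificial commit would have to contain.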

    So if you:

    then an hg convert based history will have 4 edit changesets (just like those above) and blah_1.0 bound at some ugly place with the wrong content. cvs2hg, on the other hand, will create a "fixup commit": an artificial changeset in which we really have a.c(1.1), b.c(1.1) and c.c(1.2), and will put the tag there. In the history, such a changeset is reasonably similar to a transplanted/grafted/cherry-picked commit.
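
    If you want to see which changesets in the converted repository are such artificial commits, one possible approach is to scan the commit messages. The sketch below assumes the converter puts a recognizable word in the first line of the message; "fixup" is only a guess used as the default marker, so check your own converted log for the actual wording. The function names are hypothetical; the hg options used (-R, --template) and the template keywords are standard Mercurial.

```python
import subprocess


def hg_log(repo_path):
    """Return one 'rev:node<TAB>summary' line per changeset of the repository."""
    template = "{rev}:{node|short}\t{desc|firstline}\n"
    result = subprocess.run(
        ["hg", "log", "-R", repo_path, "--template", template],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_log(output):
    """Split the log output into (identifier, summary) tuples."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        ident, _, summary = line.partition("\t")
        rows.append((ident, summary))
    return rows


def find_marked(rows, marker="fixup"):
    """Keep the changesets whose summary contains the marker (case-insensitive).

    The default marker is an assumption, not a documented cvs2hg message format.
    """
    marker = marker.lower()
    return [row for row in rows if marker in row[1].lower()]
```

    Calling find_marked(parse_log(hg_log("path/to/converted-repo"))) then gives the list of candidate changesets, which can be inspected one by one with hg log -p -r REV to confirm that each one only brings the files to the state the tag or branch is supposed to have.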