git revision-history

How to remove history information from other repositories


Before 2015 we had one big SVN repository. Then someone decided it would be a good idea to move or copy some projects into a smaller repository. We are now migrating from SVN to Git, and after converting the smaller repository, the history in Git starts on the day the projects were moved or copied.

Because we also need the old history, I searched and found that it is possible to join the history of the new repository with the history of the old repository.

For this, I used git replace to replace the first commit in the new repository with a commit from shortly before the move in the big repository, which was the source of the new history.

This works, and I only lost the first commit of the new repository and the last commit in the old history before the project was moved. But now I have the history of every project that was ever in the big repository, and the Git repository is very big.

Is there any way to delete the history, and everything related to it, of the projects that will not be in this repository?


Solution

  • Yes. Usually removing things from a repository is very problematic, but since you have only just started using it (so there are probably few clones of it or references to the current commits around), it is probably your best option.

    You could use git rebase --interactive and edit the history the way you want, but since you have only just started using that repository, you should probably take the chance to get rid of that replace (which is a potentially problematic hack) as well.
    So you should probably start afresh and re-convert the repository, taking only (and all of) the commits you want.

    If you are not familiar with this, you had better find someone who is to do the conversion. You are already going to make the usual mistakes while learning Git; if you also start with a messed-up repository, you are in for a hellish experience in the coming months and years.


    If, on the other hand, you have already done significant new work in the Git repository and you don't want to change that history, fine: keep it and re-convert only the old history.
    Then delete the previous replacement and add a new one that links to the re-converted history.
    Note that you don't need to lose any commits when using replace. Look into it more closely, or ask someone more experienced to do it. I don't want to give an extensive step-by-step explanation here, but basically you have to replace the first commit of the more recent part of the history with a copy of it that is identical except that its parent is set to the last (most recent) commit of the older history (after you have imported the objects of the older history into the newer repository, of course).
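
The replace step described above can be sketched roughly as follows. This is only a demonstration on two throwaway repositories, not a recipe for the real conversion: the repository names (old, new), the file name, the commit messages and the demo identity are all made up for illustration. It assumes a Git version that provides git replace --graft, which creates the edited copy of the commit and registers the replacement in one step.

```shell
#!/bin/sh
# Demo in a temporary directory; all names below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Helper: commit with a fixed identity so the demo does not depend on
# the local git configuration.
commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

# "old" stands in for the re-converted history of the big repository.
git init -q old
cd old
echo 1 > file.txt; git add file.txt; commit "old 1"
echo 2 > file.txt; git add file.txt; commit "old 2"
old_tip=$(git rev-parse HEAD)      # last (most recent) commit of the older history
cd ..

# "new" stands in for the repository whose history starts at the move.
git init -q new
cd new
echo 3 > file.txt; git add file.txt; commit "new 1"
echo 4 > file.txt; git add file.txt; commit "new 2"
new_root=$(git rev-list --max-parents=0 HEAD)   # first commit of the newer history

# Import the objects of the older history into the newer repository.
git fetch -q ../old HEAD

# If an earlier replacement already exists for this commit, delete it first:
#   git replace -d "$new_root"

# Register a copy of new_root that differs only in its parent (old_tip).
git replace --graft "$new_root" "$old_tip"

# Count the commits now visible from HEAD (old and new history together).
total=$(git rev-list --count HEAD)
echo "commits visible from HEAD: $total"
```

Because the replacement is an identical copy apart from its parent, no commit from either side is dropped. Running git --no-replace-objects log in the new repository shows the history without the graft, which is a quick way to compare the two views.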