I have a git repository containing files with sensitive data that is either still hardcoded, or was formerly hardcoded and now resides at some points in the git history.
In the interest of making the project publicly available so programmers with similar interests can benefit from it and contribute changes back, I want to fork it and sanitize the offending files.
The procedure I considered was as follows:
public-master
public-master
public-master
git reflog expire --expire-unreachable=now --all && git gc --prune=all --aggressive
remove all unreachable objects, which is now any object not in the public branch
git push
add the public master back upstream into the private repository.
master. Push to origin.
Is this sufficient to sanitize my repo, or would it be possible to recover sensitive data after this? Is there a more sensible and common way to resolve this problem? Are any of the steps extraneous?
For example, can I do this all in one repository, or does the nature of git packs mean I might still push an object that contains sensitive information?
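On the pack question: a push only transfers objects reachable from the refs being pushed, so publishing a single parentless branch into a fresh repository should not carry the old blobs along. Here is a minimal sketch for checking that, not a definitive procedure. It assumes a throwaway repository, a hypothetical file `config.txt`, and the made-up secret `hunter2`; none of these come from the project described above.

```shell
#!/bin/sh
set -eu

# Throwaway demo area; file name and secret are hypothetical.
demo=$(mktemp -d)
git init -q "$demo/private"
cd "$demo/private"
git config user.email "demo@example.com"
git config user.name "Demo"

# History with a hardcoded secret that is later removed.
echo 'password=hunter2' > config.txt
git add config.txt
git commit -q -m "initial commit with hardcoded secret"
secret_blob=$(git rev-parse HEAD:config.txt)   # blob that must not leak
echo 'password=CHANGEME' > config.txt
git commit -q -am "remove hardcoded secret"

# Parentless branch containing only the sanitized tree.
git checkout -q --orphan public-master
git commit -q -m "public release"

# Push only that branch into a fresh bare repository.
git init -q --bare "$demo/public.git"
git push -q "$demo/public.git" public-master

# Count commits whose diff adds or removes the secret string.
private_hits=$(git log --all --oneline -S'hunter2' | wc -l | tr -d ' ')
public_hits=$(git --git-dir="$demo/public.git" log --all --oneline -S'hunter2' | wc -l | tr -d ' ')

# Check whether the secret-bearing blob exists in the public repository at all.
if git --git-dir="$demo/public.git" cat-file -e "$secret_blob" 2>/dev/null; then
    blob_leaked=yes
else
    blob_leaked=no
fi

echo "private hits: $private_hits, public hits: $public_hits, blob leaked: $blob_leaked"
```

The same two checks (`git log --all -S...` and `git cat-file -e`) can be run against a real public clone before announcing it, substituting a known sensitive string and blob ID.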
The only problem is that I want to be able to pull from the private repo, and then the two repos would have unshared history.
That seems unavoidable, since you have changed the branch history and squashed it.
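To illustrate what "unshared history" means in practice, here is a small sketch, assuming a throwaway repository and a squashed branch created with `git checkout --orphan` (one possible way to squash, not necessarily the one used in the question). `git merge-base` exits non-zero when two branches have no common ancestor.

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the private repo.
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'one' > file.txt
git add file.txt
git commit -q -m "private history"
private_branch=$(git symbolic-ref --short HEAD)

# Squashed branch with no parent commit.
git checkout -q --orphan public-master
git commit -q -m "squashed public history"

# merge-base fails when the two branches share no ancestor.
if git merge-base "$private_branch" public-master >/dev/null; then
    shared=yes
else
    shared=no
fi
echo "shared history: $shared"
```

Since Git 2.9, merging two such branches is refused unless `--allow-unrelated-histories` is passed, which is why a plain `git pull` between the two repositories does not work cleanly.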
Instead of pulling from the new public repo, I would simply look at the changes made in the clone of the new repo and decide which ones I want to add to the local clone of the old private repo:
# update local content of new repo
cd /path/to/public/repo
git pull
# check what needs to be added
cd /path/to/clone/of/old/repo
git --work-tree=/path/to/public/repo add -p .
You will see the diffs between the old and the new content, reflecting any new changes made in the public repo, and can stage them hunk by hunk.
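Before the interactive `add -p`, the same `--work-tree` override can be used for a non-interactive preview. This is a sketch under assumptions: the temporary paths and the file `notes.txt` are hypothetical, and a plain directory stands in for the checkout of the public repo.

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for the clone of the old private repo.
demo=$(mktemp -d)
git init -q "$demo/old"
cd "$demo/old"
git config user.email "demo@example.com"
git config user.name "Demo"
echo 'line one' > notes.txt
git add notes.txt
git commit -q -m "old private content"

# Stand-in for the public checkout, with one extra line contributed there.
mkdir "$demo/public"
printf 'line one\nline two\n' > "$demo/public/notes.txt"

# Compare the old repo's index against the public working tree.
changed=$(git --work-tree="$demo/public" diff --name-only)
echo "files changed in the public tree: $changed"
```

Files that exist only in the public tree are untracked from the old repo's point of view, so they do not show up in this diff and would need to be added explicitly.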