git, bitbucket, git-gc

Clear old commits from Git and free up space in .git


I'm trying to free up space in my .git folder. I was version-controlling the images on my site and maxed out my Bitbucket space once the .git folder grew to over 1 GB. My solution was to move the images out of my local directory and into an S3 bucket.

I don't really care about the old commits so I was hoping to delete them and start fresh.

I followed this approach: how to delete all commit history in github?

As per the steps in the link:

Checkout: git checkout --orphan latest_branch

Add all the files: git add -A

Commit the changes: git commit -am "commit message"

Delete the branch: git branch -D main

Rename the current branch to main: git branch -m main

Force update your repository: git push -f origin main

Then finally: git gc --aggressive --prune=all

This cleared the commits from my Bitbucket account, but the .git folder hasn't changed size; it is still 1 GB. I thought git gc would clear the old files, but it doesn't seem to have done so.

At this point how can I clear old files from my git folder?


Solution

  • I take it that the real questions here are:

    1. Why didn't the thing I did reduce the size of the .git folder significantly?

    2. What should I have done?

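    To discover the answer, let's talk about how commits die. To understand that, you first need to know how commits can't die. Here's what you need to know: a commit can't die while a branch or tag name points to it, and a commit can't die while it is the parent of a commit that can't die.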

    Now then. If that's how a commit can't die, how can a commit die? Well, for a commit to die, no branch or tag name may point to it, or to any commit that has it as an ancestor at any depth. In other words, imagine a chain of commit parentage: a commit can only die if no chain of parentage starting from a branch or tag name reaches it.
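    To make the rule concrete, here is a minimal sketch that builds a throwaway repository, runs the same orphan-branch steps as in the question, and counts how many commits are still reachable from any name. The temporary directory, the file name, and the tag name keep are all made up for the demo.

```shell
#!/bin/sh
# Sketch only: everything happens in a scratch repository created by mktemp.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is called main

# Two commits of "old" history, plus a tag (hypothetical name) on the first one.
echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)
echo two > file.txt
git commit -q -am "second"
git tag keep "$first"

# The steps from the question: orphan branch, commit, delete main, rename.
git checkout -q --orphan latest_branch
git commit -q -m "fresh start"
git branch -D main >/dev/null
git branch -m main

# Count the commits reachable from any ref (branches, tags, HEAD).
with_tag=$(git rev-list --count --all)
git tag -d keep >/dev/null
without_tag=$(git rev-list --count --all)
echo "reachable commits: $with_tag with the tag, $without_tag without it"
```

    While the tag exists, the count is 2 (the fresh commit plus the tagged one); once the tag is deleted, only the fresh commit is reachable from a name. That single leftover name was enough to keep part of the old history from dying.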

    Okay, so now let's talk about what you did. You deleted main, in the hopes that this would cause all existing commits to die. But it seems like they didn't. So let me ask you this: was main the only branch you had? Because if it wasn't, then all those other branch names still exist, and they are keeping all their parents alive all the way back to the root commit.

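    Moreover, you've got a remote. So you also have remote-tracking branches that mirror the state of that remote: in addition to your main, you've also got an origin/main, plus an origin/<name> for every other branch on the remote. Your force push moved origin/main to the new commit, but it had no effect on the others, so any other remote-tracking branch is keeping the old commits alive, all the way back to the root commit. (Git's reflog also remembers where main and origin/main used to point, and by default that protects those commits from git gc for a while.)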
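    If you want to see which names are still holding on to old history, something like the following sketch can help. It sets up a scratch repository with a local bare repository standing in for Bitbucket, repeats the steps from the question, and then asks Git which refs can still reach the old commit. The branch name feature and all paths are invented for the demo; in a real repository you would only need the final for-each-ref command, with a commit ID of your own.

```shell
#!/bin/sh
# Sketch only: a bare repository under mktemp plays the role of the remote.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/remote.git"

# Old history, shared by main and a second (hypothetical) branch.
echo one > file.txt
git add file.txt
git commit -q -m "old history"
old=$(git rev-parse HEAD)
git branch feature
git push -q origin main feature

# The steps from the question.
git checkout -q --orphan latest_branch
git commit -q -m "fresh start"
git branch -D main >/dev/null
git branch -m main
git push -q -f origin main

# Which names can still reach the old commit?
holders=$(git for-each-ref --contains "$old" --format='%(refname)')
echo "$holders"
```

    Every name this prints, local or remote-tracking, is one more reason the old commits can't die.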

    So what you did probably didn't cause any commits to die!

    And that would explain why your .git folder didn't get significantly smaller after what you did.

    So, to do it correctly the way you did it, you would need to say git branch to find out what branches you have, and delete each of those branches. Plus, you'd need to say git tag --list to find out what tags you have, and delete each of those tags.

    But even that wouldn't be enough, because what about all the remote-tracking branches you have? To see those, you'd say git branch --remotes. You'd need to delete all those remote branches too! But you can't just delete them locally; you have to push the deletion. For every remote-tracking branch, you'd need to say git push --delete origin <branchname>.
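    For a sense of how much work that is, here is a rough sketch of the whole cleanup, run against a scratch repository and a local bare repository standing in for the remote. The names feature and v1 and all paths are made up. Two things here are my own assumptions rather than part of the steps above: the git reflog expire line (reflog entries also protect old commits from git gc by default) and the use of --prune=now. Everything in it is destructive, so don't point it at a repository you care about without a backup.

```shell
#!/bin/sh
# Sketch only: destructive cleanup, demonstrated on throwaway repositories.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/remote.git"

# Old history reachable from a branch and a tag (hypothetical names).
echo one > file.txt
git add file.txt
git commit -q -m "old history"
old=$(git rev-parse HEAD)
git branch feature
git tag v1
git push -q origin main feature v1

# The steps from the question.
git checkout -q --orphan latest_branch
git commit -q -m "fresh start"
git branch -D main >/dev/null
git branch -m main
git push -q -f origin main

current=$(git symbolic-ref --short HEAD)

# 1. Delete every other local branch.
for b in $(git for-each-ref --format='%(refname:short)' refs/heads); do
    [ "$b" = "$current" ] || git branch -q -D "$b"
done

# 2. Delete every tag, locally and on the remote.
for t in $(git tag --list); do
    git tag -d "$t" >/dev/null
    git push -q origin --delete "refs/tags/$t"
done

# 3. Push the deletion of every other remote branch.
for r in $(git for-each-ref --format='%(refname)' refs/remotes/origin); do
    name=${r#refs/remotes/origin/}
    [ "$name" = "$current" ] || [ "$name" = HEAD ] || git push -q origin --delete "$name"
done

# 4. Assumption: drop reflog entries too, then collect garbage.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

remaining=$(git rev-list --count --all)
if git cat-file -e "$old" 2>/dev/null; then old_commit=present; else old_commit=pruned; fi
echo "commits reachable from names: $remaining, old commit: $old_commit"
```

    Note that this only shrinks your local .git folder; whether and when the hosting service reclaims space on its side is up to the service.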

    Sounds like a lot of work, doesn't it?

    And that's why the simplest way to do what you want is just to check out the state of things that you like, which is presumably main, and then throw the entire .git folder away. Then say git init and start all over with a fresh commit. Also throw your remote away and make a fresh remote. Hook the two together with git remote add, and push your brand new clean repository to the brand new clean remote.
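    The start-over approach can be sketched like this. The script first fakes a small site with some old history so there is something to throw away; a local bare repository stands in for the fresh remote you would create on Bitbucket, and every path and file name is hypothetical. In a real project you would back up the working directory first, skip the setup part, and use your new repository's URL in git remote add.

```shell
#!/bin/sh
# Sketch only: runs entirely inside a scratch directory created by mktemp.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
mkdir "$work/site"
cd "$work/site"

# Setup: a repository whose history still contains an image.
git init -q .
git symbolic-ref HEAD refs/heads/main
echo "image bytes" > photo.jpg
echo "<html></html>" > index.html
git add -A
git commit -q -m "site with images"
git rm -q photo.jpg
git commit -q -m "move images to S3"

# Start over: the working files stay, all history goes.
rm -rf .git
git init -q .
git symbolic-ref HEAD refs/heads/main
git add -A
git commit -q -m "Initial commit"

# A brand new remote (here a local bare repository) and the first push.
git init -q --bare "$work/new-remote.git"
git remote add origin "$work/new-remote.git"
git push -q -u origin main

commits=$(git rev-list --count --all)
echo "commits in the new repository: $commits"
```

    Because the old .git folder is gone entirely, there are no leftover branches, tags, remote-tracking names, or reflog entries to hunt down; the new repository contains exactly one commit.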