Clearing Git History in Local and Remote Branches

Git tracks changes as commits, which makes it convenient to inspect a repository’s change history. This has a side effect, however: the history consumes storage, and the whole repository, including the current version of the files and all previous changes, may become quite large. This is especially a problem if the history contains commits that added very large files by mistake. In some cases, we may want to clear the Git history of a repository and keep only the latest version of the code.

Git supports a wide range of operations on a repository, and by combining them we can clear the history of a repository’s branches. Here, we clear the history of the master branch. To clear the history of other branches as well, repeat the same steps on each of them.

Clear Git master branch history on the Git server

To clear the history of the master branch, we can:

  • create a “clean” temporary branch that has no history
  • add all files to the temporary branch and commit them
  • delete the current master branch
  • rename the temporary branch to master
  • force push the master branch to the Git server

Because the new master branch has only one commit, after the master branch is force pushed to the Git server, the master branch’s history is “cleared”.

The commands are as follows; run them in a clone of the repository.

git checkout --orphan tmp-master # Create a temporary branch with no history
git add -A # Stage all files
git commit -m 'Add files' # Commit them as the only commit on the new branch
git branch -D master # Delete the old master branch
git branch -m master # Rename the current branch to master
git push -f origin master # Force push the master branch to the Git server
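To try the commands above without touching a real server, the following is a minimal sketch that runs them in a throwaway sandbox. The sandbox layout (`server.git`, `work`), the sample file, and the user identity are assumptions made for illustration only; the six history-clearing commands are the ones listed above.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a bare repository stands in for the Git server.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"

# A working repository with three commits on master.
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master   # use "master" whatever the default branch name is
git config user.name "Example User"       # placeholder identity for the sandbox
git config user.email "user@example.com"
git remote add origin "$sandbox/server.git"
for i in 1 2 3; do
    echo "version $i" > file.txt
    git add -A
    git commit -q -m "Commit $i"
done
git push -q origin master

# The history-clearing steps from the article.
git checkout -q --orphan tmp-master
git add -A
git commit -q -m 'Add files'
git branch -q -D master
git branch -m master
git push -q -f origin master

# Compare the local and the "server" view of master.
local_count=$(git rev-list --count master)
remote_count=$(git --git-dir="$sandbox/server.git" rev-list --count master)
local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$sandbox/server.git" rev-parse master)
echo "local commits: $local_count, remote commits: $remote_count"
```

If the steps worked, both counts describe a single commit, the two heads match, and `file.txt` still holds the latest content.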

Bookkeeping in the working Git repository clone

The clone we are working in still has stale branch metadata and leftover objects from the old history. We can set the local master branch to track the remote master branch and run a garbage collection in this clone as follows.

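git branch --set-upstream-to=origin/master master # Local master tracks origin/master
git reflog expire --expire=now --all # Drop reflog entries that still reference the old commits
git gc --aggressive --prune=all # Remove the old, now unreachable objects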
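Git's reflog typically keeps the old commits reachable for a while, so a garbage collection on its own may leave them in place. The sketch below checks this in a throwaway repository by testing whether an old commit still exists before and after the reflog entries are expired. The sandbox path, file name, and user identity are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox repository with two commits of "old" history.
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity for the sandbox
git config user.email "user@example.com"

echo "old" > file.txt
git add -A
git commit -q -m "Old history"
old_commit=$(git rev-parse HEAD)          # remember a commit from the old history
echo "new" > file.txt
git add -A
git commit -q -m "Newer history"

# Replace the history of master with a single commit.
git checkout -q --orphan tmp-master
git add -A
git commit -q -m 'Add files'
git branch -q -D master
git branch -m master

# Garbage collection alone: report whether the old commit is still stored.
git gc -q --prune=all
if git cat-file -e "$old_commit" 2>/dev/null; then
    before_expire=present
else
    before_expire=gone
fi

# Expire all reflog entries, then collect garbage again.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=all
if git cat-file -e "$old_commit" 2>/dev/null; then
    after_expire=present
else
    after_expire=gone
fi

echo "old commit before reflog expiry: $before_expire"
echo "old commit after reflog expiry:  $after_expire"
```

Comparing `git count-objects -vH` before and after the same steps in a real clone gives a rough idea of how much space was reclaimed.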

Reset other existing Git repository clones after the branch is cleared

One consequence of clearing (or otherwise changing) the history of a Git repository is that other clones still contain the old history and have to be forcefully updated. Git can forcefully reset those clones as well.

In each such clone, run the following commands to do a “forceful” pull. Note that the hard reset discards uncommitted changes to tracked files and moves master away from any local commits.

git fetch --all
git reset --hard origin/master
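Because the hard reset moves master away from whatever the clone had before, it can help to keep a pointer to the old state first. The following sketch plays the whole scenario through in a throwaway sandbox with two clones. The directory names (`work`, `other`), the `backup-master` branch name, and the user identity are hypothetical and chosen for illustration.

```shell
#!/bin/sh
set -eu

# Hypothetical sandbox: a bare "server" plus two clones, "work" and "other".
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/server.git"

git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"       # placeholder identity for the sandbox
git config user.email "user@example.com"
git remote add origin "$sandbox/server.git"
for i in 1 2 3; do
    echo "version $i" > file.txt
    git add -A
    git commit -q -m "Commit $i"
done
git push -q origin master

# A second clone that still holds the old three-commit history.
git clone -q -b master "$sandbox/server.git" "$sandbox/other"

# Clear the history from the first clone, as described earlier.
git checkout -q --orphan tmp-master
git add -A
git commit -q -m 'Add files'
git branch -q -D master
git branch -m master
git push -q -f origin master

# In the other clone: keep a backup pointer, then reset to the new history.
cd "$sandbox/other"
git branch backup-master                  # optional safety net (hypothetical name)
git fetch -q --all
git reset -q --hard origin/master

other_count=$(git rev-list --count master)
backup_count=$(git rev-list --count backup-master)
echo "master: $other_count commit(s), backup-master: $backup_count commit(s)"
```

Keep in mind that a backup branch like this also keeps the old history alive in that clone, so it should be deleted once it is no longer needed if the goal is to reclaim space there too.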
