Tags: git, git-clone, git-fetch, shallow-clone

Stop git from relating related history on fetch


I'm working on creating a template system with git, but git is being too clever (for my use case) by automatically 'unshallowing'. Let me try to explain.

For context, the templating process starts by copying template into new_repo and throwing away all history. Over time, after independent commits are made to both, new_repo will repeatedly want to pull in changes from template. The issue arises when attempting to pull down template for a comparison.

Now to get into the technical aspects of how the process is implemented currently. After creating a shallow clone of the template, the new repository is made with only the latest commit, great. Shallow is desired because we don’t want history. Then I remove the template remote thinking doing so would “invalidate” the shallow status. Now after some time, I want to compare the template with the new repository. I need the full history of template to do a proper comparison. Using fetch to retrieve the full history of the template has an unexpected side effect of unshallowing the history of the new repository instead. I only wanted to pull down the commits of template without affecting the history of the new repository.

There seems to be some lingering link between the two histories that needs to be severed. How does this link work and how can I stop git from unshallowing my history?

I've boiled it down for reproducibility:

$ git clone file:///home/test/template --depth 1 new_repo
[...]
$ ls
new_repo/
template/
$ cd new_repo
$ git log main
commit ac787e… (grafted, HEAD -> main, origin/main, origin/HEAD)
Date:   Sun Jun 1 16:27:29 2025 -0700

    transient technology
$ git remote remove origin
$ git fetch ../template --shallow-since="2025-06-01 16:00:00 -0700"
[...]
$ git log main
commit ac787e… (HEAD -> main)
Date:   Sun Jun 1 16:27:29 2025 -0700

    transient technology

===== Result of git unshallowing history from template ===
|  commit 744f4f…
|  Date:   Sun Jun 1 16:27:29 2025 -0700
|
|      initial commit
======

Solution

  • TL;DR - as mentioned before: git is indeed a bit too smart for what you are trying to do.

    Here's what you're up against:
    Git stores history as a directed acyclic graph (DAG):
    Directed: every commit records which commit(s) it came from (a root commit has none), so git knows not only the current state but also how it got there (e.g. git show HEAD~1 shows the commit one step before the current one). The shallow clone copied the template's tip commit as-is, so main in new_repo still names its parent in the template's history. Removing the remote does not change that: the shallow boundary is recorded in .git/shallow, not in the remote configuration, and fetching the parent later reconnects the two. This is why you will probably need --orphan, or a fresh git init, to get a commit with no parent.
    Acyclic: following parent links never leads back to the commit you started from, so the path from one ref to another (a ref, such as a branch or a tag, is just a named pointer to a single commit) is always a set of zero or more unique commits, with no loops.
    Graph: a collection-type structure composed of nodes (commits) and edges (parent links).
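
    To make the "link" concrete, here is a minimal sketch that rebuilds the situation in a throwaway directory. The paths, the demo identity and the commit messages are made up for illustration, and it deepens with --depth 2 instead of the --shallow-since from the question, which I assume behaves the same way for this purpose:

```shell
#!/bin/sh
# Sketch: where git keeps the parent link and the shallow boundary.
# Everything here lives in a temporary directory; names are hypothetical.
set -eu

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A stand-in template with two commits on main.
git init -q template
cd template
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "initial commit"
echo two > file.txt
git commit -q -am "transient technology"
tip=$(git rev-parse HEAD)
parent_in_template=$(git rev-parse HEAD~1)
cd ..

# file:// forces the real transport, so --depth is honoured.
git clone -q --depth 1 "file://$work/template" new_repo
cd new_repo

is_shallow=$(git rev-parse --is-shallow-repository)
boundary=$(cat .git/shallow)
# The raw commit object still names its parent, even though the
# parent object itself was never downloaded.
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')

# Removing the remote touches config and remote refs, not .git/shallow.
git remote remove origin
still_shallow=$(git rev-parse --is-shallow-repository)
before=$(git rev-list --count main)

# Fetching deeper history of the same commits moves the boundary,
# and main "grows" without main itself ever changing.
git fetch -q --depth 2 "file://$work/template" main
after=$(git rev-list --count main)

echo "shallow after clone:          $is_shallow"
echo "boundary commit:              $boundary"
echo "parent recorded in tip:       $recorded_parent"
echo "shallow after remote removal: $still_shallow"
echo "commits reachable from main:  $before before fetch, $after after"
```

    The point of the sketch is that main never moves: the same commit simply becomes able to reach its parent once that object arrives.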
    ---

    Regarding how "to compare the template with the new repository":
    Creating orphan refs to trick git, as already suggested, might work, but you probably want the copy (not clone) option that was also mentioned, since it gives you a clean history. You can then add the upstream remote anew (just take care not to make it the default remote; call it upstream or something else).
    With the old upstream remote added to the fresh git repo, you can compare the local work-tree against it with something like git diff upstream/branch-ref. You only need to fetch the remote; you never have to check it out.

    # context
    # cd /home/test/template/
    # git checkout main
    
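    # start with the similar solution (orphan branch):
    git checkout --orphan new-main   # new branch with no parent; the index still holds main's files
    git commit -m "Start fresh"      # root commit that already contains main's snapshot
    # cherry-picking main itself would most likely come up empty at this point,
    # so only pick later commits you care about:
    # git cherry-pick <commit>
    # tip: or use an empty commit
    git branch -M new-main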
    
    # context
    cd ..
    
    # ls ./
    # repo
    # template
    
    # clone is fine but you need to blow away the .git directory if you clone
    git clone file:///home/test/template --depth 1 /home/test/new-repo
    # context
    cd /home/test/new-repo
    # blow-away just the git repo
    rm -rf /home/test/new-repo/.git  # this can be dangerous if you make a typo
    # however you get here, verify git is lost
    # 
    # git status
    # fatal: not a git repository (or any of the parent directories): .git
    #
    # ls /home/test/
    # new-repo
    # repo
    # template
    
    git init
    git add .
    git commit -m "Initial template commit"
    
    # now to compare with the template let's re-add the remote but NOT as default
    git remote add upstream file:///home/test/template
    git fetch upstream
    git diff upstream/main
    # you can "git remote remove upstream" now if you want
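
    If you want to try the copy-then-init flow without touching real repositories, something like the following should do. It is only a sketch: the temporary paths, the demo identity and the use of git archive for the copy step are my assumptions, not something from the question.

```shell
#!/bin/sh
# Sketch of the copy-then-init flow; every path below is a throwaway.
set -eu

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in template with two commits.
git init -q template
cd template
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "initial commit"
echo two > file.txt
git commit -q -am "transient technology"
cd ..

# Copy the files only: git archive exports the tree without any .git data.
mkdir new-repo
git -C template archive main | tar -xf - -C new-repo

cd new-repo
git init -q
git symbolic-ref HEAD refs/heads/main
git add .
git commit -q -m "Initial template commit"

# Independent work on both sides.
echo local > local.txt
git add local.txt
git commit -q -m "local change"
echo three > ../template/file.txt
git -C ../template commit -q -am "template change"

# Bring in the template's full history under a non-default remote name.
git remote add upstream "file://$work/template"
git fetch -q upstream

own_commits=$(git rev-list --count main)
upstream_commits=$(git rev-list --count upstream/main)
is_shallow=$(git rev-parse --is-shallow-repository)
# merge-base exits non-zero when the two histories share no commit.
if git merge-base main upstream/main >/dev/null; then related=yes; else related=no; fi
changed=$(git diff --name-only main upstream/main | sort | tr '\n' ' ')

echo "commits on main:          $own_commits"
echo "commits on upstream/main: $upstream_commits"
echo "shallow:                  $is_shallow"
echo "histories related:        $related"
echo "files that differ:        $changed"
```

    Because the first commit in new-repo is a brand-new root commit, fetching the template's full history has nothing to attach to, and main keeps exactly the commits you made.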
    

    Other ideas to look into (mentioned only for completeness):
    * Use a local bundle (see git help bundle for details)
    * Fresh clone, then manually remove .git and re-init the worktree instead of using rsync (same copy idea, different toolset)
    * Dangerous stuff like git reflog delete (not for the novice; it is a starting point for rewriting git history for real, but please resist the temptation for your template use-case)
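
    For the bundle idea, a rough sketch could look like this. The file name template.bundle, the ref name template/main and the temporary paths are placeholders I picked; the new repository is assumed to have been created with a fresh git init as described above.

```shell
#!/bin/sh
# Sketch: carry the template's history around as a single bundle file.
set -eu

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in template with two commits.
git init -q template
cd template
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "initial commit"
echo two > file.txt
git commit -q -am "transient technology"
# Pack the whole main branch into one file.
git bundle create "$work/template.bundle" main >/dev/null 2>&1
cd ..

# A fresh repository that started from a plain copy of the files.
git init -q new-repo
cd new-repo
git symbolic-ref HEAD refs/heads/main
echo two > file.txt
git add file.txt
git commit -q -m "Initial template commit"

# A bundle can be verified and then fetched from like a remote.
git bundle verify "$work/template.bundle" >/dev/null 2>&1
git fetch -q "$work/template.bundle" main:refs/remotes/template/main

bundle_commits=$(git rev-list --count template/main)
own_commits=$(git rev-list --count main)
differs=$(git diff --name-only main template/main)

echo "commits in bundle history: $bundle_commits"
echo "commits on main:           $own_commits"
echo "files that differ:         ${differs:-none}"
```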

    To see how the commits are linked, run git log with the --graph option to show the commit graph, or even better give this a try:

    git log --branches --tags --notes --graph --show-signature --all
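
    A quick way to check whether two lines of history are linked at all is to count root commits and ask for a merge base. The sketch below does that in a scratch repository; the branch name fresh and the commit messages are invented for the demo.

```shell
#!/bin/sh
# Sketch: two unrelated histories in one repository, and how to spot them.
set -eu

# Identity for the throwaway commits (made-up values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/main
echo one > file.txt
git add file.txt
git commit -q -m "template: initial commit"
echo two > file.txt
git commit -q -am "template: transient technology"

# Start a second, parentless line of history in the same repository.
git checkout -q --orphan fresh
git commit -q -m "fresh: start over"

# %h = commit, %p = parents (empty for a root commit), %s = subject.
git log --all --graph --format='%h [%p] %s'

# Root commits are the ones with no parent at all.
roots=$(git rev-list --all --max-parents=0 | wc -l | tr -d ' ')
# merge-base exits non-zero when there is no common ancestor.
if git merge-base main fresh >/dev/null; then related=yes; else related=no; fi

echo "root commits: $roots, related: $related"
```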
    

    credit: this gist