git, git-submodules, shallow-clone

Shallow git submodule referencing old commit clones remote's current HEAD


This is an indirect continuation of "Why does git clone --depth 1 leave packfiles?".

I am trying to clone raspberrypi/debugprobe which contains ARM-software/CMSIS_5 as a submodule, and I would like to only download the submodule at the commit referenced by the superproject with no additional overhead.

Cloning the superproject and shallowly updating its submodules results in downloading 2 commits per submodule, rather than the expected 1:

git clone https://github.com/raspberrypi/debugprobe.git
cd debugprobe/
GIT_LFS_SKIP_SMUDGE=1 git submodule update --init --depth 1
cd CMSIS_5/
git log --all --graph --oneline
cd ../freertos/
git log --all --graph --oneline
cd ..

* c12fabc (grafted, origin/develop, origin/HEAD, develop) Update README.md (#1669)
* a65b7c9 (grafted, HEAD) Bump version and docs for release.
* 5588ae6 (grafted, origin/main, origin/HEAD, main) Update ARM_CRx_No_GIC port (#1101)
* 2dfdfc4 (grafted, HEAD) Add Cortex M7 r0p1 Errata 837070 workaround to CM4_MPU ports (#513)
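
A quick way to confirm the count without reading the graph is to ask git for the number of commits reachable from all refs and HEAD. The helper below is only a sketch; count_commits is a made-up name, not a git command, and it assumes it is run from the superproject root.

```shell
#!/bin/sh
# Report how many commits each repository holds locally.
# count_commits is a hypothetical helper name, not part of git.
count_commits() {
    for repo in "$@"; do
        # rev-list --all walks every ref plus HEAD; --count prints only the total
        printf '%s: %s\n' "$repo" "$(git -C "$repo" rev-list --all --count)"
    done
}
```

Called as `count_commits CMSIS_5 freertos`, this should report 2 for each submodule in the state shown above, matching the two grafted commits per log.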

Is this the intended behaviour of git submodule update --depth 1? Is there a way to avoid downloading the remote's current HEAD with a single submodule command?

The only workaround I have written so far is:

git clone https://github.com/raspberrypi/debugprobe.git
cd debugprobe/
git submodule init
git submodule status --cached
cd CMSIS_5/
git init
GIT_LFS_SKIP_SMUDGE=1 git pull --depth 1 https://github.com/ARM-software/CMSIS_5.git a65b7c9a3e6502127fdb80eb288d8cbdf251a6f4
git log --all --graph --oneline
cd ../freertos/
git init
git pull --depth 1 https://github.com/FreeRTOS/FreeRTOS-Kernel.git 2dfdfc4ba4d8bb487c8ea6b5428d7d742ce162b8
git log --all --graph --oneline
cd ..

* a65b7c9 (grafted, HEAD -> main) Bump version and docs for release.
* 2dfdfc4 (grafted, HEAD -> main) Add Cortex M7 r0p1 Errata 837070 workaround to CM4_MPU ports (#513)

Eager to see if anyone is able to provide a less verbose solution.
(Or an explanation as to why git submodule update --depth 1 clones the remote's current HEAD)


Solution

  • Put the script below on your PATH as git-minimal-submodule-update-init-recursive and make it executable; you can then run, e.g.

    GIT_LFS_SKIP_SMUDGE=1 git minimal-submodule-update-init-recursive
    

    to set up and fetch only the particular commits needed for your current checkout.

    Git servers have to be explicitly configured to allow fetching commits that no refname points at directly (see uploadpack.allowReachableSHA1InWant and uploadpack.allowAnySHA1InWant). I think git submodule chose not to rely on that, since most people just clone.

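    #!/bin/sh
    # Fetch only the commit each submodule is pinned to, at depth 1.
    # set -x
    cd "$(git rev-parse --show-toplevel)" || exit
    test -f .gitmodules || exit
    git submodule init
    modules=$(git rev-parse --git-path modules)
    mkdir -p "$modules"
    # `git submodule` prints "<status+sha> <path> ..."; the second field is the
    # path, so this assumes each submodule's name equals its path (the default).
    git submodule | while read -r _ smname _; do
        smpath=$(git config -f .gitmodules "submodule.$smname.path") || continue
        smurl=$(git config "submodule.$smname.url") || continue
        smsha=$(git rev-parse ":$smpath") || continue    # commit recorded in the index
        test -e "$smpath/.git" ||
            git init --template= --separate-git-dir="$modules/$smname" "$smpath"
        ( cd "$smpath" || exit
          git remote add origin "$smurl" 2>/dev/null    # ignore "remote already exists"
          git cat-file -e "$smsha" || git fetch --depth 1 origin "$smsha"
          git checkout "$smsha"
          "$0" # comment out the "$0" to stop recursing into nested submodules
        )
    done
    # git submodule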
    

    This has passed a smoke test and some arbitrary beating, so it's at least not wildly off target.
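
The core of the approach (an empty repository, one remote, one fetch by SHA at depth 1) can be tried in isolation against a throwaway local repository. The sketch below is illustrative only: the directory names upstream and sub are made up, and uploadpack.allowAnySHA1InWant is set on the throwaway upstream as an assumption, so that the request for a non-tip commit is accepted whichever protocol version is in use.

```shell
#!/bin/sh
# Sketch: fetch exactly one older commit by SHA into an empty repository.
# The directory names (upstream, sub) are hypothetical.
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# A throwaway "upstream" with two commits; we want only the older one.
git init -q "$work/upstream"
cd "$work/upstream"
git commit -q --allow-empty -m 'old commit'
old=$(git rev-parse HEAD)
git commit -q --allow-empty -m 'new tip'
# Assumption: let the serving side accept a request for a SHA that is not a ref tip.
git config uploadpack.allowAnySHA1InWant true

# The submodule-style fetch: empty repo, one remote, one commit.
git init -q "$work/sub"
cd "$work/sub"
git remote add origin "file://$work/upstream"
git fetch -q --depth 1 origin "$old"
git checkout -q "$old"

# Count what actually arrived and record where HEAD points.
count=$(git rev-list --all --count)
head=$(git rev-parse HEAD)
echo "commits present: $count"
```

Because nothing is cloned first, the remote's current tip is never requested, which is the difference from git submodule update --depth 1 as described in the question.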