gitgit-checkoutgit-fetchrevision-historygit-sparse-checkout

How do I clone, fetch or sparse checkout a single directory or a list of directories from git repository?


How do I clone, fetch or sparse checkout a single file or directory or a list of files or directories from a git repository avoiding downloading the entire history or at least keeping history download at minimum?

For the benefit of people landing here, these are references to other similar questions:

These similar questions were asked long ago and git evolved ever since, which ended up causing a flood of different answers, some better, some worse, depending on the version of git being considered. The trouble is that not a single answer from these aforementioned questions attend all requirements from all these questions combined, which means that you have to read all answers and compile in your head your own answer which eventually attend all requirements.

This question here expands on previous questions mentioned, imposing more flexible and stringent requirements than all other questions combined. So, once again:

How do I clone, fetch or sparse checkout a single file or directory or a list of files or directories from a git repository avoiding downloading the entire history or at least keeping history download at minimum?


Solution

  • This bash function below does the trick.

    function git_sparse_checkout {
        # git repository, e.g.: http://github.com/frgomes/bash-scripts
        local url=$1
        # directory where the repository will be downloaded, e.g.: ./build/sources
        local dir=$2
        # repository name, in general taken from the url, e.g.: bash-scripts
        local prj=$3
        # tag, e.g.: master
        local tag=$4
        [[ ( -z "$url" ) || ( -z "$dir" ) || ( -z "$prj" ) || ( -z "$tag" ) ]] && \
            echo "ERROR: git_sparse_checkout: invalid arguments" && \
            return 1
        shift; shift; shift; shift
    
        # Note: any remaining arguments after these above are considered as a
        # list of files or directories to be downloaded.
        
        mkdir -p ${dir}
        if [ ! -d ${dir}/${prj} ] ;then
            mkdir -p ${dir}/${prj}
            pushd ${dir}/${prj}
            git init
            git config core.sparseCheckout true
            local path="" # local scope
            for path in $* ;do
                echo "${path}" >> .git/info/sparse-checkout
            done
            git remote add origin ${url}
            git fetch --depth=1 origin ${tag}
            git checkout ${tag}
            popd
        fi
    }
    

    This is an example of how this can be used:

    function example_download_scripts {
      url=http://github.com/frgomes/bash-scripts
      dir=$(pwd)/sources
      prj=bash-scripts
      tag=master
      git_sparse_checkout $url $dir $prj $tag "user-install/*" sysadmin-install/install-emacs.sh
    }
    

    In the example above, notice that a directory must be followed by /* and must be between single quotes or double quotes.

    UPDATE: An improved version can be found at: https://github.com/frgomes/bash-scripts/blob/master/bin/git_sparse_checkout