python git pip

Missing file when installing a GitHub repo with pip


I tried to install https://github.com/facebookresearch/perception_models with

pip install git+https://github.com/facebookresearch/perception_models.git

I expected this to basically clone the repository as a package, but when I ran my code I found that one of the files (core/vision_encoder/bpe_simple_vocab_16e6.txt.gz) was missing. It worked after I manually copied the file into the installed package, but that does not feel like the proper way.

README.md suggests using git clone. Is this the only way? I prefer not to clutter my repository with external modules that won't be modified.


Solution

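  • You’re right: pip install git+... does not install bpe_simple_vocab_16e6.txt.gz. pip builds the package from the repository's packaging configuration, so non-Python files are only copied into the installed package when that configuration declares them as package data, which does not appear to be the case for this file. It also may not fetch LFS-stored content. The method is to: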

    Clone the repository, then install it in editable mode using:

    git clone https://github.com/facebookresearch/perception_models.git
    cd perception_models
    pip install -e .
    

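    An editable install points Python at the cloned source tree, so the file is found where it already sits in the repository. The clone can live anywhere on disk, so it does not have to go inside your own repository. Using pip install git+... won’t include those files, so cloning is the proper approach unless they update the packaging.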
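
    To confirm whether an install actually contains the data file, a small check like the one below can help. This is only a sketch: the helper names missing_package_files and installed_package_dir are made up for illustration, and it assumes the top-level import name is core, inferred from the path in the question.

```python
import importlib.util
from pathlib import Path


def missing_package_files(package_dir, relative_paths):
    """Return the entries of relative_paths that are not files under package_dir."""
    root = Path(package_dir)
    return [rel for rel in relative_paths if not (root / rel).is_file()]


def installed_package_dir(package_name):
    """Find the directory of an installed top-level package without importing it.

    Returns None when the package is not installed or is not a directory package.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(next(iter(spec.submodule_search_locations)))


if __name__ == "__main__":
    # Assumption: the package is importable as "core" (taken from the file path
    # core/vision_encoder/bpe_simple_vocab_16e6.txt.gz mentioned in the question).
    pkg_dir = installed_package_dir("core")
    if pkg_dir is None:
        print("package 'core' is not installed in this environment")
    else:
        missing = missing_package_files(
            pkg_dir, ["vision_encoder/bpe_simple_vocab_16e6.txt.gz"]
        )
        print("missing files:", missing or "none")
```

    Running it once after pip install git+... and once after the editable install should show whether the vocabulary file is reachable in each case.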