I am running a RAG pipeline with LlamaIndex and a quantized Llama-3-8B-Instruct. I just installed these libraries:
```
!pip install --upgrade huggingface_hub
!pip install --upgrade peft
!pip install llama-index bitsandbytes accelerate llama-index-llms-huggingface llama-index-embeddings-huggingface
!pip install --upgrade transformers
!pip install --upgrade sentence-transformers
```
Then I tried to run the quantization pipeline like this:
```python
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
```
However, I got this error: `ModuleNotFoundError: No module named 'huggingface_hub.inference._types'`. The last time I worked with this pipeline, two months ago, the code worked, so I think LlamaIndex has changed something, especially since clicking through the error led me to `from huggingface_hub.inference._types import ConversationalOutput`, and `ConversationalOutput` doesn't appear anywhere in the Hugging Face docs.

So, what should I do to fix this error and get this RAG pipeline running?
`ModuleNotFoundError` indicates the code is importing a module that does not exist. It's due to a mismatch between the version of `huggingface_hub` you installed and the versions `llama_index` is compatible with.
`llama_index` imports a module named `_types` that was recently deleted from the `huggingface_hub.inference` package. We can infer this from the failing import in the traceback you posted:

```python
from huggingface_hub.inference._types import ConversationalOutput
```
In `huggingface_hub` v0.24.0 the `_types` module still exists; it was removed in v0.25.0.
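If you want to confirm which side of that boundary your environment is on, a quick standard-library check (assuming nothing beyond `huggingface_hub` being installed) is:

```python
import importlib.util

# find_spec returns a ModuleSpec if the module can be imported, None otherwise
spec = importlib.util.find_spec("huggingface_hub.inference._types")
print("present" if spec is not None else "missing")
```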
You need to uninstall `huggingface_hub` and install a version compatible with `llama_index`. I'd try 0.24.0, since it still contains the module currently causing the `ModuleNotFoundError`:
```bash
pip uninstall huggingface-hub
pip install huggingface-hub==0.24.0
```
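After reinstalling, it's worth verifying that Python actually picks up the pinned version (the package exposes a version string):

```python
import huggingface_hub

# Should print 0.24.0 after the downgrade
print(huggingface_hub.__version__)
```

Alternatively, a range specifier such as `pip install "huggingface-hub<0.25"` excludes the release that removed the module while still allowing patch updates.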
Upgrades to external dependencies shouldn't be able to break an existing project's workflow, because a project should explicitly manage its dependencies and the versions it has been tested against.
In Python, the convention is a `requirements.txt` file that lists the dependencies and versions a project requires. You can export your current dependencies by running:
```bash
pip freeze > requirements.txt
```
This captures both the dependencies and their exact versions. The dependencies specified in a requirements file can be installed back by running:
```bash
pip install -r requirements.txt
```
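For illustration, a pinned requirements file for a pipeline like yours might look roughly like this; the version numbers other than `huggingface-hub` are placeholders, not a tested combination, and `pip freeze` will give you the exact ones from your working environment:

```
huggingface-hub==0.24.0
transformers==4.43.3
llama-index==0.10.55
bitsandbytes==0.43.1
accelerate==0.33.0
```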
To isolate dependencies for different projects, you should use virtual environments:
```bash
python -m venv llama
source llama/bin/activate
```
This creates an isolated Python environment for your project with no packages installed. It provides a fresh slate for every project and ensures one project's dependencies don't interfere with another's.
When capturing dependencies in a requirements file, you should only include the dependencies the current project actually needs, not every package on your system. Using `pip freeze` inside a virtual environment makes this easy, as sketched below.
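Putting the pieces together, a typical setup for this project might look like the following (the environment name and package list mirror your install commands; the exact versions are for you to pin):

```bash
# Create and activate a fresh environment for this project
python -m venv llama
source llama/bin/activate

# Install the known-good, pinned huggingface_hub plus the rest of the stack
pip install huggingface-hub==0.24.0
pip install llama-index bitsandbytes accelerate \
    llama-index-llms-huggingface llama-index-embeddings-huggingface

# Record exactly what this environment contains
pip freeze > requirements.txt
```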
You should still upgrade external dependencies as new versions are released. This gives you the library's latest features and is essential for picking up fixes for known security vulnerabilities.
This should be done in a structured manner: upgrade one dependency at a time, then re-run your tests or pipeline. If the update does not cause issues, great; pin the new version in `requirements.txt`. If it does, you must either revert to the latest working version of the dependency or refactor your code to work with the new release.
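A sketch of that upgrade loop, assuming you have some kind of smoke test for the pipeline (`run_rag_pipeline.py` below is a placeholder for whatever exercises your code):

```bash
# Upgrade a single dependency in the active virtual environment
pip install --upgrade huggingface-hub

# Re-run your pipeline or test suite (placeholder command)
python run_rag_pipeline.py

# If it breaks, roll back to the pinned known-good version
pip install huggingface-hub==0.24.0
```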