I've recently been building a backend system using Python with Flask. Somehow, an unexpected fetch operation now runs at the startup of my app.py (where app.run(port=) lies), and the server won't run until it has finished fetching something.
This behaviour does not go away on later runs: it costs about 5 seconds every time I start the server. That makes debugging painful (Flask debug mode can't be enabled for some reason). I assume it's related to some AI package, but I can't figure out exactly what it is or what it's for. Can someone familiar with this help?
Some of the packages used in this project are:
The files are fetched every time the embedding model is loaded. To get rid of that step, load the model from a local path so that nothing is fetched over the internet. (It seems you're using pymilvus.model, so I will use this model as an example: https://milvus.io/docs/embed-with-bgm-m3.md):
Pre-download the model
Manually download the model to a local directory, or load it once so that it is downloaded to the cache directory:
from pymilvus.model.hybrid import BGEM3EmbeddingFunction
bge_m3_ef = BGEM3EmbeddingFunction(model_name="BAAI/bge-m3", device="cpu", use_fp16=False)
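For the "manually download to a local directory" route, here is a minimal sketch. It assumes the huggingface_hub package is installed (it normally comes along with the embedding dependencies) and uses its snapshot_download function; download_model and the target folder are hypothetical names chosen for illustration.

```python
from pathlib import Path


def download_model(repo_id: str, target_dir: str) -> Path:
    """Download every file of a Hugging Face model repo into target_dir.

    Hypothetical helper: run it once while online, then load from the
    returned folder on later starts.
    """
    # Imported lazily so that merely defining this helper does not
    # require huggingface_hub to be importable.
    from huggingface_hub import snapshot_download

    target = Path(target_dir).expanduser().resolve()
    target.mkdir(parents=True, exist_ok=True)
    # local_dir asks for a plain copy of the files in that folder.
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling download_model("BAAI/bge-m3", "./models/bge-m3") once from a separate script should leave a folder that can then be passed as model_name, provided the embedding function accepts a local path as this answer assumes.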
Load the local model
Option 1: Use offline mode
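import os
# Set the flag before importing pymilvus.model / transformers, because it is read when those libraries are imported.
os.environ["TRANSFORMERS_OFFLINE"] = "1"
from pymilvus.model.hybrid import BGEM3EmbeddingFunction
bge_m3_ef = BGEM3EmbeddingFunction(model_name="BAAI/bge-m3", device="cpu", use_fp16=False)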
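Offline mode only helps if the flags are in place before the Hugging Face libraries are imported, since, as far as I know, they read the environment once at import time. A small sketch of that ordering follows; enable_hf_offline_mode and already_imported are hypothetical helper names, and HF_HUB_OFFLINE is set as well because huggingface_hub has its own switch.

```python
import os
import sys


def enable_hf_offline_mode() -> None:
    """Ask the Hugging Face libraries to use only locally cached files."""
    os.environ["HF_HUB_OFFLINE"] = "1"        # read by huggingface_hub
    os.environ["TRANSFORMERS_OFFLINE"] = "1"  # read by transformers


def already_imported(names=("transformers", "huggingface_hub")):
    """Return the libraries that were imported before the flags were set."""
    return [name for name in names if name in sys.modules]


late = already_imported()
enable_hf_offline_mode()
if late:
    # The flags may be ignored by modules that were loaded earlier.
    print(f"warning: {late} imported before offline mode was enabled")

# Import the embedding function only after this point, for example:
# from pymilvus.model.hybrid import BGEM3EmbeddingFunction
```

Putting this at the very top of app.py, above every other import, is the simplest way to keep the order right. Alternatively, the variables can be exported in the shell that launches Flask.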
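Option 2: Load the model from a local directory
# Replace <revision> with the folder name found under snapshots/ (that is where config.json and the weights live). "~" is not expanded automatically, hence expanduser.
bge_m3_ef = BGEM3EmbeddingFunction(model_name=os.path.expanduser("~/.cache/huggingface/hub/models--BAAI--bge-m3/snapshots/<revision>"), device="cpu", use_fp16=False)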
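One caveat when pointing at the cache: the loadable files (config.json, weights, tokenizer) normally sit in a snapshots/<commit hash> subfolder of the models--ORG--NAME directory, and refs/main records which hash is current. The sketch below resolves that folder with the standard library only. resolve_snapshot_dir is a hypothetical helper, and it assumes the default cache location, which the HF_HOME or HF_HUB_CACHE environment variables can change.

```python
from pathlib import Path
from typing import Optional


def resolve_snapshot_dir(
    repo_id: str,
    cache_root: Optional[Path] = None,
    revision: str = "main",
) -> Path:
    """Return the snapshot folder of a cached Hugging Face model."""
    if cache_root is None:
        # Assumed default location of the Hugging Face hub cache.
        cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    repo_dir = Path(cache_root).expanduser() / ("models--" + repo_id.replace("/", "--"))

    # refs/<revision> is a small text file holding the commit hash.
    ref_file = repo_dir / "refs" / revision
    if not ref_file.is_file():
        raise FileNotFoundError(f"{ref_file} not found; is the model cached?")
    commit = ref_file.read_text().strip()

    snapshot = repo_dir / "snapshots" / commit
    if not snapshot.is_dir():
        raise FileNotFoundError(f"{snapshot} is missing from the cache")
    return snapshot
```

The result can then be passed on as model_name=str(resolve_snapshot_dir("BAAI/bge-m3")), so the hash does not have to be copied by hand.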