python flask huggingface-transformers milvus

Python script triggers an unexplained fetching operation on every run


I've recently been building a backend system using Python with Flask. Somehow an unexpected fetching operation now runs at the startup of my app.py (the file where app.run(port=) is called), and the server won't start until the fetch completes.

This behaviour does not go away on later runs and costs ~5 seconds each time I start the server. That makes for a bad debugging experience (Flask debug mode can't be enabled for some reason). I assume it's related to some AI package, but I can't figure out exactly what is being fetched or what it's for. Maybe someone familiar with this can help?

[Screenshot: output after running the script]

Some of the packages used in this project are:


Solution

  • The file-fetching step happens each time you load the embedding model. To get rid of it, load the model from a local path so that nothing is fetched over the network. (It seems you're using pymilvus.model, so I will use this model as an example: https://milvus.io/docs/embed-with-bgm-m3.md):

    1. Pre-download the model

      Manually download the model to a local directory, or load it once so that it is downloaded to the cache directory:

      from pymilvus.model.hybrid import BGEM3EmbeddingFunction
      
      bge_m3_ef = BGEM3EmbeddingFunction(model_name="BAAI/bge-m3", device="cpu", use_fp16=False)
      
    2. Load the local model

      Option 1: Use offline mode

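      import os
      
      # Set these before importing anything that loads transformers / huggingface_hub
      os.environ["TRANSFORMERS_OFFLINE"] = "1"
      os.environ["HF_HUB_OFFLINE"] = "1"  # also blocks downloads made through huggingface_hub
      from pymilvus.model.hybrid import BGEM3EmbeddingFunction
      bge_m3_ef = BGEM3EmbeddingFunction(model_name="BAAI/bge-m3", device="cpu", use_fp16=False)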
      

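      Option 2: Load the model from a local directory. Python does not expand "~" in paths, so expand it explicitly. Also note that in the Hugging Face cache the actual model files (config.json, weights) sit in a snapshots/<commit-hash> subdirectory, so you may need to point model_name at that subdirectory rather than at the top-level folder.

      import os

      model_dir = os.path.expanduser("~/.cache/huggingface/hub/models--BAAI--bge-m3")
      bge_m3_ef = BGEM3EmbeddingFunction(model_name=model_dir, device="cpu", use_fp16=False)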
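
The Hugging Face cache keeps each downloaded revision in a snapshots/<commit-hash>/ subdirectory of models--BAAI--bge-m3, so the directory to pass as model_name can be looked up rather than hard-coded. Below is a minimal sketch using only the standard library. find_snapshot_dir is a hypothetical helper name (not part of pymilvus or huggingface_hub), and it assumes the default cache location and layout; if you have set HF_HOME or a similar variable, pass your own cache_root.

```python
from pathlib import Path
from typing import Optional


def find_snapshot_dir(repo_id: str, cache_root: Optional[Path] = None) -> Optional[Path]:
    """Return the newest cached snapshot directory for repo_id, or None.

    Hypothetical helper: assumes the default Hugging Face hub cache layout,
    <cache_root>/models--<org>--<name>/snapshots/<commit-hash>/.
    """
    if cache_root is None:
        # Assumed default cache location; override if your setup differs.
        cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    repo_dir = Path(cache_root) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    if not snapshots.is_dir():
        return None
    # Keep only snapshots that actually contain a config.json.
    candidates = [d for d in snapshots.iterdir() if (d / "config.json").exists()]
    if not candidates:
        return None
    # Prefer the most recently modified snapshot.
    return max(candidates, key=lambda d: d.stat().st_mtime)
```

If find_snapshot_dir("BAAI/bge-m3") returns a path, str() of that path is what you would pass as model_name; if it returns None, the model has not been downloaded to that cache yet.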