Tags: langchain, embedding, dtype, chromadb, vector-database

Precision used in ChromaDB Index


I am using the BAAI/bge-large-en-v1.5 model to generate embeddings, which I then store in a ChromaDB vector store. The embeddings are held in memory and indexed with HNSW. Is there a way to find out the dtype or precision of these embeddings, i.e. whether they are float32, float64, or something else?

Thanks


Solution

  • You can fetch a stored embedding and inspect its dtype like this:

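    import chromadb
    import numpy as np
    
    # Open the persistent database and look up the collection
    client = chromadb.PersistentClient(path="./chroma_db")  # Adjust the path as needed
    collection = client.get_collection("my_collection")
    
    # Fetch one record by id, including its embedding
    vector_data = collection.get(ids=["your_vector_id"], include=["embeddings"])
    
    # Check dtype. Compare against None and use len(): some versions return a
    # NumPy array here, and truth-testing an array can raise ValueError.
    embeddings = vector_data.get("embeddings")
    if embeddings is not None and len(embeddings) > 0:
        # Caveat: if embeddings is a plain Python list, NumPy infers float64
        # regardless of how the values are actually stored.
        vector_array = np.asarray(embeddings)
        print("Vector dtype:", vector_array.dtype)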
    
    

    By default, ChromaDB stores vectors as float32.
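
    One caveat with the dtype check: when the embeddings come back as plain Python lists, converting them with NumPy reports float64 no matter how they were stored, because Python floats are 64-bit. A rough value-based check is to see whether every value survives a round trip through float32 unchanged. The sketch below uses only the standard library; `fits_float32` is a hypothetical helper name and the sample values are made up for illustration, not real embeddings.

```python
import struct


def fits_float32(values):
    """Return True if every value survives a float32 round trip unchanged."""
    for x in values:
        # Pack as a 4-byte IEEE 754 float, then unpack back to a Python float.
        roundtrip = struct.unpack("f", struct.pack("f", x))[0]
        if roundtrip != x:
            return False
    return True


# Illustrative values only (assumption): the first list is pre-rounded to
# float32 precision, the second holds ordinary 64-bit Python floats.
stored_as_f32 = [struct.unpack("f", struct.pack("f", v))[0] for v in (0.1, -0.25, 0.333)]
full_f64 = [0.1, -0.25, 0.333]

print(fits_float32(stored_as_f32))  # True
print(fits_float32(full_f64))       # False
```

    If this returns True for a whole embedding, the values carry no more than float32 precision, which is consistent with float32 storage. It cannot prove it, since float64 data could happen to hold only float32-representable values.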