I'm trying to use MUVERA compression with Jina ColBERT v2 embeddings in Weaviate, following the official documentation. However, MUVERA compression is not being applied: I'm still getting raw multi-vectors instead of compressed single vectors.
I followed the Weaviate documentation example exactly:
import weaviate
from weaviate.classes.config import Configure, Property, DataType
import os
# Connect to Weaviate
client = weaviate.connect_to_local()
# Create collection with MUVERA configuration
collection = client.collections.create(
    "DemoCollection",
    vectorizer_config=[
        Configure.NamedVectors.text2colbert_jinaai(
            name="jina_colbert",
            source_properties=["text"],
            vector_index_config=Configure.VectorIndex.hnsw(
                multi_vector=Configure.VectorIndex.MultiVector.multi_vector(
                    encoding=Configure.VectorIndex.MultiVector.Encoding.muvera(
                        # Optional parameters for tuning MUVERA
                        ksim=4,
                        dprojections=16,
                        repetitions=20,
                    )
                )
            )
        )
    ],
    properties=[
        Property(name="text", data_type=DataType.TEXT)
    ]
)
# Insert test data
collection.data.insert(
    properties={"text": "The quick brown fox jumps over the lazy dog"}
)
# Query and check vector format
result = collection.query.fetch_objects(limit=1, include_vector=True)
if result.objects:
    obj = result.objects[0]
    vec_data = obj.vector['jina_colbert']
    if isinstance(vec_data[0], list):
        print(f"Multi-vector: [{len(vec_data)}, {len(vec_data[0])}]")
    else:
        print(f"Single vector: {len(vec_data)} dimensions")
client.close()
If I understand the documentation correctly, MUVERA should compress the multi-vector embeddings from ColBERT into a single vector. With the parameters above (ksim=4, dprojections=16, repetitions=20), I expected the script to print a single vector, but the output is:
Multi-vector: [16, 128]
I'm getting a raw multi-vector with 18 tokens × 128 dimensions. What am I missing?
Checking the created schema shows MUVERA in the configuration:
{
  "multivector": {
    "aggregation": "maxSim",
    "enabled": true,
    "muvera": {
      "dprojections": 16,
      "enabled": true,
      "ksim": 4,
      "repetitions": 20
    }
  }
}
MUVERA shows as enabled in the schema, but is the compression being applied to the vectors?
I verified my Weaviate version supports MUVERA (requires 1.31+, I'm on 1.31.5).
But maybe I'm misunderstanding how MUVERA works? I was expecting to get a single vector as output, but that doesn't seem to be the case.
Answer provided by Marcin Antas on the Weaviate Community Slack:
MUVERA is a vector index option, so the vectors inside that index are represented as single vectors (FDEs, fixed dimensional encodings). Your jina_colbert vector index is therefore built using single vectors.
Weaviate always stores the original vector with the object, so with this command:
vec_data = obj.vector['jina_colbert']
you are accessing the original vector associated with that object, which is stored on disk, not the single vector that was created and stored in the vector index using MUVERA encoding.
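As a rough sanity check on what that index-side single vector looks like: in the MUVERA design, the length of a fixed dimensional encoding is repetitions × 2^ksim × dprojections, regardless of how many tokens the text has. The helper below is a hypothetical illustration (the name fde_dimension is not part of the Weaviate client) and assumes Weaviate follows that formula:

```python
def fde_dimension(ksim: int, dprojections: int, repetitions: int) -> int:
    """Length of one MUVERA fixed dimensional encoding (FDE).

    Assumes the standard MUVERA layout: each repetition splits the space
    into 2**ksim buckets and keeps dprojections values per bucket.
    """
    return repetitions * (2 ** ksim) * dprojections


# Parameters from the collection config in the question.
print(fde_dimension(ksim=4, dprojections=16, repetitions=20))  # 5120
```

Under that assumption, the index would hold one 5120-dimensional vector per object, while obj.vector['jina_colbert'] keeps returning the original per-token 128-dimensional vectors.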
You should see lower memory requirements, faster import times, and a higher QPS (queries per second) rate.
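To make the difference between the stored multi-vector and the indexed single vector concrete, here is a toy sketch of a MUVERA-style encoding in plain Python. It is not Weaviate's implementation: the function name make_toy_fde_encoder and the random input data are made up for illustration, and the sketch leaves out details of the real algorithm such as scaling, document-side averaging, and filling of empty buckets. It only shows that a variable number of token vectors goes in and one fixed-length vector comes out, while the input is left untouched.

```python
import random


def make_toy_fde_encoder(dim, ksim, dprojections, repetitions, seed=0):
    """Build a toy MUVERA-style encoder: many token vectors in, one vector out."""
    rng = random.Random(seed)
    reps = []
    for _ in range(repetitions):
        # ksim random hyperplanes; the sign pattern selects one of 2**ksim buckets.
        planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(ksim)]
        # Random +/-1 projection from dim down to dprojections values.
        proj = [[rng.choice((-1.0, 1.0)) for _ in range(dim)]
                for _ in range(dprojections)]
        reps.append((planes, proj))

    def encode(token_vectors):
        fde = []
        for planes, proj in reps:
            buckets = [[0.0] * dprojections for _ in range(2 ** ksim)]
            for vec in token_vectors:
                # Locate the token's bucket from the signs of its dot products.
                bucket = 0
                for plane in planes:
                    bit = sum(p * v for p, v in zip(plane, vec)) > 0
                    bucket = (bucket << 1) | int(bit)
                # Add the token's projected coordinates to that bucket.
                for j, row in enumerate(proj):
                    buckets[bucket][j] += sum(r * v for r, v in zip(row, vec))
            # Concatenate all buckets of this repetition onto the output.
            for b in buckets:
                fde.extend(b)
        return fde

    return encode


# Stand-in for the stored multi-vector: 18 tokens x 128 dimensions (random data).
data_rng = random.Random(42)
multi_vector = [[data_rng.gauss(0.0, 1.0) for _ in range(128)] for _ in range(18)]

encode = make_toy_fde_encoder(dim=128, ksim=4, dprojections=16, repetitions=20)
single_vector = encode(multi_vector)

print(len(multi_vector), len(multi_vector[0]))  # 18 128 (what the object keeps)
print(len(single_vector))                       # 5120 (what the index would hold)
```

The shape check in the question inspects the first of these two values, which is why it reports a multi-vector even though MUVERA is enabled on the index.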