I am using Elixir Livebook to compute embeddings, and then use them to search my Qdrant database:
{:ok, model_info} = Bumblebee.load_model({:hf, "BAAI/bge-base-en-v1.5"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "BAAI/bge-base-en-v1.5"})
# {:ok, model_info} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
# {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
serving = Bumblebee.Text.text_embedding(model_info, tokenizer)
text = "where do i change my password?"
emb = Nx.Serving.run(serving, text)
collection_name = "demo_v1"
Qdrant.search_points(collection_name, %{vector: emb.embedding, limit: 3})
I am getting an error:
{:error,
{Tesla.Middleware.JSON, :encode,
%Protocol.UndefinedError{
protocol: Jason.Encoder,
value: #Nx.Tensor<
f32[768]
[-0.8432725071907043, -0.5420605540275574, ...]
>,
description: "Jason.Encoder protocol must always be explicitly implemented.\n\nIf you own the struct, you can derive the implementation specifying which fields should be encoded to JSON:\n\n @derive {Jason.Encoder, only: [....]}\n defstruct ...\n\nIt is also possible to encode all fields, although this should be used carefully to avoid accidentally leaking private information when new fields are added:\n\n @derive Jason.Encoder\n defstruct ...\n\nFinally, if you don't own the struct you want to encode to JSON, you may use Protocol.derive/3 placed outside of any module:\n\n Protocol.derive(Jason.Encoder, NameOfTheStruct, only: [...])\n Protocol.derive(Jason.Encoder, NameOfTheStruct)\n"
}}}
From the error message, it seems I still need to convert the Nx.Tensor embedding to some JSON-serializable format before I can pass it to the search_points/2 function in Qdrant.
How can I resolve this?
It seems (although the docs are not explicit) that you are supposed to call Nx.to_list/1 to get JSON-serializable data. Jason has no encoder for the Nx.Tensor struct, but it can encode a plain list of floats:
Qdrant.search_points(collection_name, %{vector: Nx.to_list(emb.embedding), limit: 3})
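To check the conversion in isolation, here is a minimal sketch that runs without Bumblebee or Qdrant. The three-element tensor is a made-up stand-in for emb.embedding, and the package versions in Mix.install are assumptions; the map simply mirrors the one passed to Qdrant.search_points/2 above.

```elixir
# Assumed versions; adjust to whatever your Livebook already installs.
Mix.install([{:nx, "~> 0.7"}, {:jason, "~> 1.4"}])

# Hypothetical stand-in for emb.embedding (the real one is f32[768]).
embedding = Nx.tensor([0.25, -0.5, 1.0], type: :f32)

# Nx.to_list/1 turns the tensor into an ordinary Elixir list of floats.
vector = Nx.to_list(embedding)

# A plain list is something Jason knows how to encode, so the same map
# that failed with the tensor now serializes without error.
{:ok, json} = Jason.encode(%{vector: vector, limit: 3})
IO.puts(json)
```

If you embed several texts in one call and stack the results into a rank-2 tensor, Nx.to_list/1 returns a nested list (one inner list per row), so you would still pass each row to the search separately.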