Tags: python, chatbot, langchain, qdrant, qdrant-client

Configure Multitenancy with Langchain and Qdrant


I'm creating a Q&A chatbot and I'm using langchain and qdrant.

I'm trying to configure LangChain so it can use Qdrant in a multitenant environment. The Qdrant docs say the best approach in my case is "Partition by payload": store a group_id (one per client) inside the payload of each point in a collection, so that it's then possible to filter on that group_id (which in my case identifies the client). Here's the link to the doc: https://qdrant.tech/documentation/tutorials/multiple-partitions/
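For context, the tutorial's approach boils down to something like this with the raw qdrant-client (a minimal sketch; the in-memory client, the toy collection, the vectors, and the group_id values are all placeholders):

from qdrant_client import QdrantClient, models

client = QdrantClient(":memory:")  # placeholder; in practice this is the real server

# Toy collection standing in for "collection1"
client.create_collection(
    collection_name="collection1",
    vectors_config=models.VectorParams(size=3, distance=models.Distance.COSINE),
)

# Every point carries the tenant identifier in its payload
client.upsert(
    collection_name="collection1",
    points=[
        models.PointStruct(
            id=1,
            vector=[0.1, 0.2, 0.3],  # placeholder embedding
            payload={"group_id": "client_a"},
        ),
    ],
)

# A search is restricted to one tenant with a payload filter on group_id
hits = client.search(
    collection_name="collection1",
    query_vector=[0.1, 0.2, 0.3],  # placeholder query embedding
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="group_id",
                match=models.MatchValue(value="client_a"),
            )
        ]
    ),
)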

On the LangChain side, I have added a "group_id" metadata field to the documents that I'm saving into Qdrant.
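Concretely, the ingestion looks something like this (a sketch; the document text, source name, and connection settings are placeholders, and I'm assuming OpenAI embeddings here):

from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import Document
from langchain.vectorstores import Qdrant

embeddings = OpenAIEmbeddings()

# Each document is tagged with the tenant it belongs to
docs = [
    Document(
        page_content="Some text belonging to this client...",
        metadata={"group_id": "client_a", "source": "doc1.pdf"},
    ),
]

# from_documents writes the metadata (including group_id) into each point's payload
vectorstore = Qdrant.from_documents(
    docs,
    embeddings,
    url="http://localhost:6333",  # placeholder connection settings
    collection_name="collection1",
)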

I'd like to understand how to filter on group_id when I use LangChain. This is how I'm retrieving the answer to a question:

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chat_models import ChatOpenAI
from langchain.vectorstores import Qdrant
from qdrant_client import QdrantClient

qdrant = Qdrant(
    client=QdrantClient(...),  # connection details elided
    collection_name="collection1",
    embeddings=embeddings,
)
prompt = ...
llm = ChatOpenAI(...)
# The retriever currently searches the whole collection, with no tenant filter
qa_chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    return_source_documents=True,
    retriever=qdrant.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
)
result = qa_chain({"question": question})

The group_id represents the client and is known before the question is asked.

Any help is much appreciated, Thanks.


Solution

  • I have found the answer. Thanks for all the suggestions.

    To filter on an attribute "group_id" (which is the client_id), I add a group_id = client metadata field when I load the data with "VectorStore.from_documents", and I use the "as_retriever" function to pass a search filter so that only the sources with that group_id are returned:

    chain = RetrievalQAWithSourcesChain.from_chain_type(
        llm=llm,
        chain_type=chain_type,
        max_tokens_limit=max_tokens_limit,
        return_source_documents=True,
        # The retriever only returns documents whose group_id metadata
        # matches the current client
        retriever=vectorstore.as_retriever(
            search_kwargs={"filter": {"group_id": client}}
        ),
        reduce_k_below_max_tokens=False,
        chain_type_kwargs={"prompt": prompt},
    )
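    A note on how the filter is applied: LangChain's Qdrant wrapper stores document metadata under a "metadata" key in each point's payload, and as far as I can tell the dict form of "filter" in search_kwargs is matched against that metadata, so {"group_id": client} ends up as a server-side payload filter on metadata.group_id. The chain is then invoked exactly as in the question:

    # question comes from the user; client is the tenant id known up front
    result = chain({"question": question})
    print(result["answer"])
    # every returned source shares the same group_id
    print(result["source_documents"])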