I'm trying to pass filters to the Redis retriever to do hybrid search on my embeddings (vector search plus metadata filtering). The following doesn't work! The filter is never passed through, so the filter is always None:
    retriever = redis.as_retriever(
        search_type="similarity_distance_threshold",
        search_kwargs={'include_metadata': True, 'distance_threshold': 0.8, 'k': 5},
        filter="(@launch:{false} @menu_text:(%%chicken%%))"
    )
I found another example, and apparently the filter expression should be passed inside search_kwargs, but I can't figure out the correct syntax. If I do it as follows:
    retriever = redis.as_retriever(
        search_type="similarity_distance_threshold",
        search_kwargs={'include_metadata': True, 'distance_threshold': 0.8, 'k': 5, 'filter': '@menu_text:(%%chicken%%) @lunch:{true}'},
    )
it generates this search query:
    similarity_search_by_vector > redis_query : (@content_vector:[VECTOR_RANGE $distance_threshold $vector] @menu_text:(%%chicken%%) @lunch:{true})=>{$yield_distance_as: distance}
and fails with the following error:
    redis.exceptions.ResponseError: Invalid attribute yield_distance_as
Any idea how to fix it?
System Info:
langchain 0.0.346
langchain-core 0.0.10
python 3.9.18
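It was a bug in LangChain! I found that '_prepare_range_query()' in langchain generates a Redis query with the wrong syntax. When a filter is present, it wraps the vector range clause and the filter in parentheses and then appends '=>{$yield_distance_as: distance}' to the whole group, while Redis appears to accept that attribute only directly after the vector range clause. So I made the following small change, which puts the filter first and keeps the range clause last, and it fixed the error for us: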
    def _prepare_range_query(
        self,
        k: int,
        filter: Optional[RedisFilterExpression] = None,
        return_fields: Optional[List[str]] = None,
    ) -> "Query":
        try:
            from redis.commands.search.query import Query
        except ImportError as e:
            raise ImportError(
                "Could not import redis python package. "
                "Please install it with `pip install redis`."
            ) from e
        return_fields = return_fields or []
        vector_key = self._schema.content_vector_key
        base_query = f"@{vector_key}:[VECTOR_RANGE $distance_threshold $vector]"

        if filter:
            # Original line, which produced the invalid query:
            # base_query = "(" + base_query + " " + str(filter) + ")"
            # Fix: put the filter first so the range clause stays last.
            base_query = str(filter) + " " + base_query

        # The attribute block now directly follows the vector range clause.
        query_string = base_query + "=>{$yield_distance_as: distance}"

        return (
            Query(query_string)
            .return_fields(*return_fields)
            .sort_by("distance")
            .paging(0, k)
            .dialect(2)
        )
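To see what the change does to the query string without a Redis server, here is a minimal standalone sketch of the same string composition. The helper name `build_range_query` is hypothetical and not part of LangChain, and it takes the filter as a plain string rather than a `RedisFilterExpression`. It only mirrors the patched logic above, so treat it as an illustration rather than the library's implementation.

```python
from typing import Optional


def build_range_query(vector_key: str, filter_expr: Optional[str] = None) -> str:
    """Compose a vector range query string (hypothetical helper for illustration)."""
    # Vector range clause; $distance_threshold and $vector are query parameters.
    range_clause = f"@{vector_key}:[VECTOR_RANGE $distance_threshold $vector]"
    if filter_expr:
        # Filter first, range clause last, so the attribute block appended
        # below directly follows the range clause, not a parenthesized group.
        range_clause = f"{filter_expr} {range_clause}"
    return range_clause + "=>{$yield_distance_as: distance}"


# Same field names and filter as in the question.
query = build_range_query("content_vector", "@menu_text:(%%chicken%%) @lunch:{true}")
print(query)
```

With the filter from the question, the result no longer starts with a parenthesized group, and the `$yield_distance_as` attribute sits immediately after the `VECTOR_RANGE` clause.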