I am trying to create an index on a nested field in Redis so I can search over it easily, specifically a numeric field representing a timestamp, but I can't figure it out. The documentation is quite complicated, and ever since RediSearch was merged into main Redis, I've been struggling to find any good examples.
Here's my attempt:
import time
from redis import Redis
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.field import NumericField
from redis.commands.search.query import Query, NumericFilter
def main():
    r = None
    test_dict1 = {"context": {"test": {"other": "test"}, "messages": [{"text": "mytext", "timestamp": str(time.time())}]}}
    test_dict2 = {"context": {"test": {"other": "test"}, "messages": [{"text": "mytext2", "timestamp": str(time.time() + 10)}]}}
    try:
        r = Redis()
        r.json().set("uuid:4587-7d5f9-4545", "$", test_dict1)
        r.json().set("uuid:4587-7d5f9-4546", "$", test_dict2)
        r.ft('timestamp').create_index(fields=(NumericField("$.messages.timestamp")), definition=IndexDefinition(prefix=['uuid:'], index_type=IndexType.HASH))
        print(r.json().get("uuid:4587-7d5f9-4545", "$.context.test.other"))
        q = Query("*").add_filter(NumericFilter(field="$.messages.timestamp", minval=0, maxval=time.time()))
        print(r.ft('timestamp').search(q))
    except Exception as e:
        raise e
    finally:
        if r is not None:
            r.flushall()

if __name__ == "__main__":
    main()
That currently returns 0 results, but doesn't throw any errors.
There are a few problems here. First, your dictionary stores the timestamps as strings, but the index declares them as numeric. Indexing silently fails because of the type mismatch. So, replace that with:
test_dict1 = {"context": {"test": {"other": "test"}, "messages": [{"text": "mytext", "timestamp": time.time()}]}}
test_dict2 = {"context": {"test": {"other": "test"}, "messages": [{"text": "mytext2", "timestamp": time.time() + 10}]}}
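To see why the string version gets skipped, it can help to look at what is actually serialized. This is a minimal sketch using only the standard-library `json` module; the variable names are illustrative, and it assumes the value is stored with whatever JSON type it was serialized as:

```python
import json
import time

now = time.time()

# str(time.time()) serializes as a quoted JSON string, which does not
# match a field declared as numeric.
as_string = json.dumps({"timestamp": str(now)})

# A bare float serializes as a JSON number, which does match.
as_number = json.dumps({"timestamp": now})

print(as_string)
print(as_number)

# Reading the values back shows the type each document really carries.
string_value = json.loads(as_string)["timestamp"]
number_value = json.loads(as_number)["timestamp"]
print(type(string_value).__name__, type(number_value).__name__)
```

The first printed document has quotes around the timestamp and the second does not, which is the difference the index cares about.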
Secondly, the path in your field definition is wrong: there is no JSON key at $.messages.timestamp; it's at $.context.messages.[*].timestamp, so you need to change your index definition. For the sake of readability you might want to include an alias for that field. Finally, as @simon-prickett says, you are indexing the documents as hashes, so you need to declare it as a JSON index:
r.ft('timestamp').create_index(fields=[NumericField("$.context.messages.[*].timestamp", as_name="ts")], definition=IndexDefinition(prefix=['uuid:'], index_type=IndexType.JSON))
Once that's done you can query as
q = Query("*").add_filter(NumericFilter(field="ts", minval=0, maxval=time.time()))
and get your results.
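Putting the three fixes together, a sketch of the whole script might look like the following. The helper names `build_docs` and `run_demo` are made up for illustration. It assumes a local Redis server with the search and JSON modules loaded, in a version that can index several numeric values through `[*]`. Unlike the original, it creates the index before writing the documents (so they are indexed as they are written) and cleans up with `dropindex(delete_documents=True)` instead of `flushall()`.

```python
import time


def build_docs(now):
    """Return the two sample documents, keyed by Redis key, with numeric timestamps."""
    return {
        "uuid:4587-7d5f9-4545": {
            "context": {
                "test": {"other": "test"},
                "messages": [{"text": "mytext", "timestamp": now}],
            }
        },
        "uuid:4587-7d5f9-4546": {
            "context": {
                "test": {"other": "test"},
                "messages": [{"text": "mytext2", "timestamp": now + 10}],
            }
        },
    }


def run_demo():
    # redis-py is imported here so build_docs() stays usable without it.
    from redis import Redis
    from redis.commands.search.field import NumericField
    from redis.commands.search.indexDefinition import IndexDefinition, IndexType
    from redis.commands.search.query import NumericFilter, Query

    r = Redis()  # assumption: default host/port, search and JSON modules available
    index = r.ft("timestamp")

    # JSON index on the nested array path, aliased to "ts" for querying.
    index.create_index(
        fields=[NumericField("$.context.messages.[*].timestamp", as_name="ts")],
        definition=IndexDefinition(prefix=["uuid:"], index_type=IndexType.JSON),
    )
    try:
        for key, doc in build_docs(time.time()).items():
            r.json().set(key, "$", doc)

        # Filter on the alias; the second document is 10 seconds in the
        # future, so only the first should fall inside the range.
        q = Query("*").add_filter(NumericFilter(field="ts", minval=0, maxval=time.time()))
        result = index.search(q)
        print(result.total, [d.id for d in result.docs])
    finally:
        # Remove only the index and the documents it covers.
        index.dropindex(delete_documents=True)
```

Calling `run_demo()` against a scratch Redis instance should print a single match for the first key, if the assumptions above hold.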