I have a Python single-dispatch generic function like this:
from functools import lru_cache, singledispatch

@singledispatch
def cluster(documents, n_clusters=8, min_docs=None, depth=2):
    ...
It is overloaded like this:
@cluster.register(QuerySet)
@lru_cache(maxsize=512)
def _(documents, *args, **kwargs):
    ...
The second one basically preprocesses a QuerySet object and calls the generic cluster() function.
A QuerySet is a Django object, but that should not matter here, apart from the fact that it is hashable and thus usable with lru_cache.
The generic function cannot be cached because it accepts unhashable objects such as lists as arguments. The overload, however, can be cached because a QuerySet object is hashable. That is why I've added the @lru_cache() decorator.
However, caching does not seem to be applied:
qs: QuerySet = [...]
start = datetime.now(); cluster(Document.objects.all()); print(datetime.now() - start)
0:00:02.629259
I would expect the same call to return almost instantly the second time, but:
start = datetime.now(); cluster(Document.objects.all()); print(datetime.now() - start)
0:00:02.468675
This is confirmed by the cache statistics:
cluster.registry[django.db.models.query.QuerySet].cache_info()
CacheInfo(hits=0, misses=2, maxsize=512, currsize=2)
Changing the order of the @lru_cache and @cluster.register decorators does not seem to make a difference.
This question is similar, but the answer does not fit on the individual function level.
Is it even possible to combine these two decorators on this level? If so, how?
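hash(Document.objects.all()) == hash(Document.objects.all()) is not consistently true for a Django QuerySet, because every call to Document.objects.all() returns a new QuerySet object. lru_cache looks up its entries by hash and equality of the arguments, so each new QuerySet object is treated as a new key, which matches the misses=2, currsize=2 in your statistics.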
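The effect can be reproduced without Django. The following is only a sketch: FakeQuerySet and expensive are hypothetical stand-ins, and the assumption is that the real QuerySet, like this class, compares and hashes by object identity rather than by content.

```python
from functools import lru_cache


class FakeQuerySet:
    """Hypothetical stand-in: defines neither __eq__ nor __hash__,
    so Python falls back to identity-based comparison and hashing."""

    def __init__(self, sql):
        self.sql = sql


@lru_cache(maxsize=512)
def expensive(qs):
    # Placeholder for the real, slow work.
    return len(qs.sql)


# Two separate objects with identical content: both calls miss the cache.
expensive(FakeQuerySet("SELECT 1"))
expensive(FakeQuerySet("SELECT 1"))
info_new_objects = expensive.cache_info()

# The very same object passed twice: one more miss, then a hit.
same = FakeQuerySet("SELECT 1")
expensive(same)
expensive(same)
info_same_object = expensive.cache_info()

print(info_new_objects)
print(info_same_object)
```

So the combination of decorators is not the problem; the cache key is. Keeping a reference to one QuerySet object and passing it again would hit the cache, but building a fresh one for each call does not.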
The call Document.objects.all() doesn't hit the database until the returned QuerySet is evaluated.
Pickling is usually used as a precursor to caching. Depending on your use case, you can try caching on the pickle of the QuerySet or of its query attribute.
import pickle

@cluster.register(bytes)
@lru_cache(maxsize=512)
def _(documents, *args, **kwargs):
    # The pickled bytes are the cache key; unpickle to get the object back.
    documents = pickle.loads(documents)
    ...
cluster(pickle.dumps(Document.objects.all()))
or
cluster(pickle.dumps(Document.objects.all().query))
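As a rough, Django-free sketch of the pattern above: a plain list stands in for the QuerySet, and the calls list is a hypothetical helper that only counts how often the generic function really runs. It assumes that equal inputs pickle to identical bytes, which is what makes the bytes usable as a cache key.

```python
import pickle
from functools import lru_cache, singledispatch

calls = []  # hypothetical counter: one entry per real (uncached) run


@singledispatch
def cluster(documents, n_clusters=8):
    # Stand-in for the expensive generic implementation.
    calls.append(len(documents))
    return sorted(documents)[:n_clusters]


@cluster.register(bytes)
@lru_cache(maxsize=512)
def _(payload, *args, **kwargs):
    # bytes are hashable and compare by value, so lru_cache can key on them.
    documents = pickle.loads(payload)
    return cluster(documents, *args, **kwargs)


docs = ["b", "a", "c"]
first = cluster(pickle.dumps(docs))
second = cluster(pickle.dumps(docs))  # same bytes, served from the cache
info = cluster.registry[bytes].cache_info()
print(first, second, info)
```

Keep in mind that pickling a QuerySet evaluates it, so with the first variant the database query still runs on every call and only the clustering step is saved. Pickling the query attribute avoids that evaluation, but then the overload has to rebuild a QuerySet from the unpickled query before calling the generic function.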