How to combine @singledispatch and @lru_cache?


I have a Python single-dispatch generic function like this:

from functools import lru_cache, singledispatch

@singledispatch
def cluster(documents, n_clusters=8, min_docs=None, depth=2):
  ...

It is overloaded like this:

@cluster.register(QuerySet)
@lru_cache(maxsize=512)
def _(documents, *args, **kwargs):
  ...

The second one basically preprocesses a QuerySet object and then calls the generic cluster() function. A QuerySet is a Django object, but that should not matter here, apart from the fact that it is hashable and thus usable with lru_cache.

The generic function cannot be cached, though, because it accepts unhashable objects such as lists as arguments. The overloading function can be cached, however, because a QuerySet object is hashable; that is why I've added the @lru_cache decorator.
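To illustrate the hashability constraint, here is a minimal sketch (plain Python, no Django): lru_cache builds its cache key by hashing the arguments, so an unhashable argument such as a list fails immediately.

from functools import lru_cache

@lru_cache(maxsize=512)
def cached(x):
    return x

cached((1, 2, 3))   # tuples are hashable, so this call can be cached
cached([1, 2, 3])   # raises TypeError: unhashable type: 'list'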

However, caching does not seem to be applied:

qs: QuerySet = [...]

start = datetime.now(); cluster(Document.objects.all()); print(datetime.now() - start)               
0:00:02.629259

I would expect the same call to complete in an instant the second time, but:

start = datetime.now(); cluster(Document.objects.all()); print(datetime.now() - start)               
0:00:02.468675

This is confirmed by the cache statistics:

cluster.registry[django.db.models.query.QuerySet].cache_info()
CacheInfo(hits=0, misses=2, maxsize=512, currsize=2)

Changing the order of the @lru_cache and @cluster.register decorators does not seem to make a difference.

This question is similar, but its answer does not apply at the level of an individual function.

Is it even possible to combine these two decorators at this level? If so, how?

1 Answer

Answered by Nizam Mohamed (accepted):

hash(Document.objects.all()) == hash(Document.objects.all()) is False for a Django QuerySet: QuerySet defines neither __eq__ nor __hash__, so hashing falls back to object identity, and every call to Document.objects.all() returns a new QuerySet instance. Since lru_cache keys its cache on the arguments' hashes and equality, every call is therefore a cache miss.

The call Document.objects.all() doesn't hit the database until the QuerySet returned is evaluated.
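You can reproduce the cache misses without Django at all; a minimal sketch with a hypothetical stand-in class that, like QuerySet, relies on default identity hashing:

from functools import lru_cache

class FakeQuerySet:
    # like QuerySet, this defines neither __eq__ nor __hash__,
    # so hashing falls back to object identity
    pass

@lru_cache(maxsize=512)
def expensive(qs):
    return qs

expensive(FakeQuerySet())
expensive(FakeQuerySet())        # a brand-new object with a different hash -> another miss
print(expensive.cache_info())    # CacheInfo(hits=0, misses=2, maxsize=512, currsize=2)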

As the Django docs on pickling QuerySets put it, "Pickling is usually used as a precursor to caching."

Depending on your use case, you can try caching the pickle of the QuerySet, or of its query attribute.

import pickle

@cluster.register(bytes)
@lru_cache(maxsize=512)
def _(documents, *args, **kwargs):
    documents = pickle.loads(documents)   # restore the pickled object from the bytes key
    ...

cluster(pickle.dumps(Document.objects.all()))

or

cluster(pickle.dumps(Document.objects.all().query))
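If you pass the pickled query attribute instead of the pickled QuerySet, the bytes overload has to rebuild a QuerySet before delegating. A rough alternative sketch, assuming the Document model and the cluster() generic from the question (the qs.query = ... restore step is the one shown in the Django docs):

import pickle
from functools import lru_cache

@cluster.register(bytes)
@lru_cache(maxsize=512)
def _(documents, *args, **kwargs):
    query = pickle.loads(documents)    # a django.db.models.sql.Query object
    qs = Document.objects.all()
    qs.query = query                   # reattach the restored query to a fresh QuerySet
    return cluster(list(qs), *args, **kwargs)

cluster(pickle.dumps(Document.objects.all().query))

Note the trade-off: pickling the whole QuerySet forces all of its results to be loaded into memory at dumps() time, while pickling only query defers the database hit until the rebuilt QuerySet is evaluated inside the overload.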