python django memcached django-queryset django-cache

Caching a Django queryset for the calendar date


I have a query whose results only change once a day. It seems like a waste to run that query on every request I get for that page, so I am investigating using memcached for this.

How would I begin? Does anyone have suggestions, or pitfalls I should avoid, in using Django's caching? Should I cache at the template or at the view?

This question might seem vague but it's only because I've never dealt with caching before. So if there's something I could elaborate on, please just ask.

Elaboration

Per Ken Cochrane:

  1. How often does this data change: The relevant data would be locked in for that calendar date. So, for example, I'll pull the data for 1/30/2011 and I'm okay with serving that cached copy for the whole day until 1/31/2011, when it would be refreshed.

  2. Do I use this data in more than one place: Only in one view.

  3. How much data is it going to be: An average of 10 model objects that contain about 15 fields with the largest being a CharField(max_length=120). I will trim the number of fields down using values() to about half of those.


Solution

  • Normally before I decide where to do the caching I ask myself a few questions.

    1. How often does this data change?
    2. Do I use this data in more than one place?
    3. How much data is it going to be?

    Since I don't know all of the details for your application, I'm going to make some assumptions.

    1. You have a view that either takes in a date or uses the current date to query the database and pull out all of the calendar events for that date.
    2. You only display this information on one template.
    3. The amount of data isn't too large (fewer than 100 entries).

    With these assumptions you have three options: (1) cache the template, (2) cache the view, or (3) cache the queryset.

    Normally when I do my caching I cache the queryset. This gives me greater control over how the data is cached, and I can reuse the same cached data in more than one place.

    The easiest way that I have found to cache the queryset is to do it in the model manager for the model in question. I would create a method like get_calendar_by_date(date) that handles the query and the caching for me. Here is a rough mockup:

    from django.core.cache import cache
    from django.db import models

    CACHE_TIMEOUT_SECONDS = 60 * 60 * 24  # this is 24 hours

    class CalendarManager(models.Manager):

        def get_calendar_by_date(self, by_date):
            """ assuming by_date is a datetime object """
            # one cache entry per calendar date, e.g. CAL_DATE_01_30_2011
            date_key = by_date.strftime("%m_%d_%Y")
            cache_key = 'CAL_DATE_%s' % date_key
            cal_date = cache.get(cache_key)
            if cal_date is not None:
                return cal_date

            # not in cache, so get it from the database
            cal_date = self.filter(event_date=by_date)

            # set cal_date in cache for later use; pickling the queryset
            # for the cache forces the query to run at this point
            cache.set(cache_key, cal_date, CACHE_TIMEOUT_SECONDS)
            return cal_date
    

    Some things to look out for when caching

    1. Make sure the objects that you are storing in the cache can be pickled.
    2. Since memcached doesn't know what day it is, you need to make sure you don't over-cache. For example, if it is noon on Jan 21st and you cache for 24 hours, that calendar information will be served until noon on Jan 22nd, which might not be what you are looking for. So when you set the timeout for the cached query, either set it to a small value so it expires sooner, or calculate how long to cache so that it expires when you want it to.
    3. Make sure you know the size of the objects you want to cache. If your memcached instance only has 16MB of storage but you want to store 32MB of data, the cache isn't going to do you much good.
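
    For point 2, one way to "calculate how long to cache" is to work out how many seconds remain until the next midnight and use that as the timeout. This is only a sketch: the helper name seconds_until_midnight is made up for illustration, and it assumes the day should roll over at midnight in the server's local time.

```python
from datetime import datetime, timedelta


def seconds_until_midnight(now=None):
    """Return the whole seconds left until the next midnight (never less than 1).

    `now` defaults to the server's local time; it is a parameter so the
    function can be checked with fixed datetimes.
    """
    if now is None:
        now = datetime.now()
    # midnight at the start of the following day, in the same timezone as `now`
    next_midnight = datetime.combine(
        now.date() + timedelta(days=1),
        datetime.min.time(),
        tzinfo=now.tzinfo,
    )
    remaining = int((next_midnight - now).total_seconds())
    # avoid returning 0 in the last fraction of a second before midnight
    return max(remaining, 1)
```

    In the mockup above, the result could be passed in place of the fixed 24 hour value, e.g. cache.set(cache_key, cal_date, seconds_until_midnight()), so an entry cached at noon expires 12 hours later instead of lingering into the next day.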

    When caching the template or view you need to watch out for the following

    1. Set your cache timeout so that it isn't too large. I don't think you can programmatically change the template cache timeout, since it is hard-coded in the template, so if you set it too high you will end up with a page that is out of date. You should be able to programmatically change the view cache time, so that is a little safer.
    2. If you are caching the template and there is other information on the template that is dynamic and changes all of the time, make sure that you only put the cache tags around the section of the page you want cached for a while. If you put them in the wrong place you might end up with the wrong result.

    Hopefully that gives you enough information to get started. Good luck.