djangoannotationscountgeneric-relations

Annotate in Django with generic relations


I am using django-hitcount to tally hits to my database objects. I want to count the hits by object to determine which object has the most hits in a given time range. The app has two models of interest here:

class Hit(models.Model):
    created         = models.DateTimeField(editable=False)
    ip              = models.CharField(max_length=40, editable=False)
    session         = models.CharField(max_length=40, editable=False)
    user_agent      = models.CharField(max_length=255, editable=False)
    user            = models.ForeignKey(User,null=True, editable=False)
    hitcount        = models.ForeignKey(HitCount, editable=False)

class HitCount(models.Model):
    hits            = models.PositiveIntegerField(default=0)
    modified        = models.DateTimeField(default=datetime.datetime.utcnow)
    content_type    = models.ForeignKey(ContentType,
                        verbose_name="content cype",
                        related_name="content_type_set_for_%(class)s",)
    object_pk       = models.TextField('object ID')
    content_object  = generic.GenericForeignKey('content_type', 'object_pk')

"Hit" logs each hit with a timestamp while HitCount stores overall number of hits. To get the hits by object and within a time range, I need to do the following:

Filter Hit by date created Count number of hits per content_object (within time range filtered above) order by count calculated above return content_object and count

This could be very expensive, so I planned on calcing/caching once per day.

As a first step, I wanted to count the number of hits per content_object irrespective of time range.

limited_hc = Hit.objects.all().values('hitcount__content_object').annotate(count = Count('hitcount__object_pk'))

I immediately run into an issue:

Cannot resolve keyword 'hitcount__content_object' into field. Choices are: created, hitcount, id, ip, session, user, user_agent

After some digging, I found that annotation and generic relations do not work well together. If I use object_pk instead of content_object, it works fine, but then I don't have the name of the object.

So my question: What is an alternative to achieve the same result? How can group by object but also retain the name?

I do have the model (content_type) and the id (object_pk), so I could always pull these separately, but that seems inelegant. . .


Solution

  • It will probably be more efficient for your purpose to add a generic relation to Hit model:

    class Hit(models.Model):
        ...
        object_id = models.PositiveIntegerField()
        content_type = models.ForeignKey(ContentType)
        content_object = generic.GenericForeignKey('content_type', 'object_id')
    

    and then run count() query on the Hit directly:

    t = ContentType.objects.get_for_model(the_object_being_hit)
    id = the_object_being_hit.id
    count = Hit.objects.filter(
                       created__range=(from_timestamp, to_timestamp),
                       content_type = t,
                       object_id = id
                     ).count()
    

    You can use Django South migration system to modify the hit-count's model. You can also try to subclass Hit after monkey-patching it's Meta class, or just define your own models that suit your needs better.

    edit If you want to count hits for a whole class of objects or several classes, then you can have:

    count = Hit.objects.filter(
                         created__range = myrange,
                         content_type__in = set_of_types
                       ).count()
    

    Where the set_of_types can be either a list constructed with get_for_model calls or a query set obtained by direct filtering of the ContentType table.

    The nice part of the count() method is that it makes counting happen in the database, which is much faster.

    To get a breakdown by content_type try this:

    counts = Hit.objects.filter(
                       created__range = myrange
                    ).values(
                       'content_type'
                    ).annotate(
                       Count('content_type')
                    )
    

    That should return a dictionary of counts vs content type id, quite close to what you want.