
Sorting email threads in Django using mail header information


I have a Django app that stores email threads. When I parse the original emails from an mbox and insert them into the database, I store the 'message-id' and 'in-reply-to' header values. The message-id is a unique string that identifies the message, and the in-reply-to holds the message-id of the message that a given message is responding to.

Here is the message portion of my model:

from django.db import models

class Message(models.Model):
    subject = models.CharField(max_length=300, blank=True, null=True)
    mesg_id = models.CharField(max_length=150, blank=True, null=True)
    in_reply_to = models.CharField(max_length=150, blank=True, null=True)
    orig_body = models.TextField(blank=True, null=True)

The goal is to show email conversations in a threaded format similar to Gmail. I was planning on just using the message-id (mesg_id in the model) and in-reply-to (in_reply_to in the model) from the mail headers to keep track of the mail and do the threading.

After searching Stack Overflow and Google, it looks like I should be using a library like django-treebeard or django-mptt to do this. When I review the documentation for either of these two libraries, most of the example models use foreign key relationships, and this confuses me because my fields are plain strings.

Given the example model above, how can I integrate either django-treebeard or django-mptt into my app? Is this possible using the mesg_id and in_reply_to fields?


Solution

  • If I were implementing this, I might try the following, using django-mptt:

    from django.db import models
    from mptt.models import MPTTModel, TreeForeignKey
    
    class Message(MPTTModel):
        subject = models.CharField(max_length=300, blank=True)
        # Add unique=True if msg_id will definitely be unique.
        msg_id = models.CharField(max_length=150, blank=True)
        # on_delete is required since Django 2.0; CASCADE is one possible choice.
        reply_to = TreeForeignKey('self', null=True, blank=True,
                                  on_delete=models.CASCADE, related_name='replies')
        orig_body = models.TextField(blank=True)
    
        class MPTTMeta:
            # Tell django-mptt that the tree parent is stored in reply_to.
            parent_attr = 'reply_to'
    

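    Note that I've turned reply_to into a ForeignKey, so the model no longer stores the raw in-reply-to string. Instead, when importing a message you look up the Message whose msg_id matches its in-reply-to header and assign that instance to reply_to. This means that if I have a Message instance msg I can simply do msg.reply_to to access the Message instance that it was a reply to, or msg.replies.all() to get all direct replies to the message.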

    In theory, you could use the msg_id as a primary key field. I personally prefer keeping data separate from primary keys, but I don't know of a particular reason to think my way is better.
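
Because reply_to points at another row, a parent has to exist before its replies can reference it, and an mbox is not guaranteed to list messages in that order. The following is a minimal pure-Python sketch of one way to work out a safe insertion order from the two header values. The function name order_for_insert and the sample message ids are hypothetical, and it assumes message-ids are unique within the mbox.

```python
from collections import defaultdict


def order_for_insert(headers):
    """Order (message_id, in_reply_to) pairs so parents precede replies.

    Returns a list of (message_id, parent_id) tuples. parent_id is None
    for thread roots, including messages whose parent is not in the mbox.
    Messages caught in a reply cycle are never reached and are left out.
    """
    headers = list(headers)
    known = {msg_id for msg_id, _ in headers}

    children = defaultdict(list)
    roots = []
    for msg_id, parent_id in headers:
        if parent_id in known and parent_id != msg_id:
            children[parent_id].append(msg_id)
        else:
            # No in-reply-to header, or the parent was never archived.
            roots.append(msg_id)

    # Depth-first walk from each root, so every parent is emitted first.
    ordered = []
    stack = [(root, None) for root in reversed(roots)]
    while stack:
        msg_id, parent_id = stack.pop()
        ordered.append((msg_id, parent_id))
        for child in reversed(children[msg_id]):
            stack.append((child, msg_id))
    return ordered


sample = [
    ("<c@example.org>", "<b@example.org>"),
    ("<a@example.org>", None),
    ("<b@example.org>", "<a@example.org>"),
    ("<d@example.org>", "<missing@example.org>"),
]
result = order_for_insert(sample)
for msg_id, parent_id in result:
    print(msg_id, "<-", parent_id)
```

With an order like this, each message could be saved in turn while keeping a dict from msg_id to the saved Message, so that reply_to can be set to the already-saved parent instance (or left as None for a root). That last step depends on your import code and is not shown here.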