django search django-haystack whoosh

Django Haystack indexing is not working for a many-to-many field in a model


I am using Haystack for search in our Django application, and search itself works fine. However, I have an issue with real-time indexing. For this I use Haystack's built-in RealtimeSignalProcessor (haystack.signals.RealtimeSignalProcessor). My model contains one many-to-many field. When only the data in this many-to-many field changes, the RealtimeSignalProcessor does not seem to update the index properly, and after such a change I get wrong search results.

Search works again after I manually run the rebuild_index command. I think rebuild_index works because it clears the index first and then builds it again from scratch.

Can someone suggest a solution to this problem?

Here is the relevant code.

Model:

from django.db import models

class Message_forum(models.Model):
    message = models.ForeignKey(Message)
    tags = models.ManyToManyField(Tag, blank=True, null=True)  # this is the many-to-many field

search_indexes.py:

class Message_forumIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.EdgeNgramField(document=True, use_template=True)
    message = indexes.CharField(model_attr='message', null=True)
    tags = indexes.CharField(model_attr='tags', null=True)

    def get_model(self):
        return Message_forum

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

    def prepare_tags(self, obj):
        return [tag.tag for tag in obj.tags.all()]

index template:

{{ object.tags.tag }}

settings.py:

HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

I am using the latest version of Haystack, with Whoosh as the back end.


Solution

  • I figured it out after digging into Haystack's code.

    Haystack's default RealtimeSignalProcessor connects the post_save and post_delete signals for every model in the application: post_save triggers the handle_save method and post_delete triggers handle_delete. In handle_save, Haystack looks up the search index for the sender. In my case, a change to the tags (many-to-many) field goes through the Message_forum_tags model, the intermediate model that Django generates for the relation (Message_forum.tags.through), so that model is the sender. There is no index for it in my search indexes, because it is not one of my application's models. handle_save therefore ignored any change to this model and never updated the indexed data for the changed Message_forum object.
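The lookup described above can be pictured with a toy model. This is not Haystack's code: ToyIndexRegistry, the free function handle_save, and the local NotHandled exception are stand-ins invented for illustration, and the "index" is just a list. It only shows why a sender without a registered index leads to a silently skipped update.

```python
class NotHandled(Exception):
    """Stand-in for the error raised when a model has no registered index."""


class ToyIndexRegistry:
    """Maps model classes to their (toy) indexes, here plain lists."""

    def __init__(self):
        self._indexes = {}

    def register(self, model):
        self._indexes[model] = []

    def get_index(self, model):
        try:
            return self._indexes[model]
        except KeyError:
            raise NotHandled(model.__name__) from None


def handle_save(registry, sender, instance):
    """Reindex `instance` if `sender` has an index; otherwise do nothing."""
    try:
        index = registry.get_index(sender)
    except NotHandled:
        return False  # no index for this sender: the change is silently skipped
    index.append(instance)
    return True


class Message_forum:       # indexed model
    pass


class Message_forum_tags:  # auto-generated intermediate model, not indexed
    pass


registry = ToyIndexRegistry()
registry.register(Message_forum)
obj = Message_forum()

print(handle_save(registry, Message_forum, obj))       # True
print(handle_save(registry, Message_forum_tags, obj))  # False
```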

    So, I have come up with two different solutions for this problem:

    I can create a custom real-time signal processor specific to my model Message_forum. In its setup method, I can connect the m2m_changed signal for each many-to-many field in Message_forum with handle_save, passing Message_forum as the sender. Haystack will then recognize it (it does not exactly validate the sender, it attempts to get the sender's index object) and update the index data of the changed object.
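A minimal sketch of the first option follows, written as a mixin so that the re-dispatch logic can be read on its own. M2MReindexMixin, handle_m2m_changed, the m2m_owner and m2m_fields attributes, and the myapp module path are hypothetical names, not part of Haystack. The sketch assumes it is combined with haystack.signals.RealtimeSignalProcessor, which provides setup(), teardown() and handle_save(sender, instance, **kwargs), and it relies on Django's m2m_changed signal arguments (action, reverse, model, pk_set).

```python
class M2MReindexMixin:
    """Hypothetical mixin: re-dispatch m2m_changed to Haystack's handle_save.

    Intended to be mixed into haystack.signals.RealtimeSignalProcessor.
    """

    m2m_owner = None  # model that owns the relation, e.g. Message_forum
    m2m_fields = ()   # names of its ManyToManyFields, e.g. ("tags",)

    # React only after the relation rows have actually been written.
    REINDEX_ACTIONS = ("post_add", "post_remove", "post_clear")

    def _through_models(self):
        # Model.<field>.through is the intermediate model that Django passes
        # as `sender` of m2m_changed.
        return [getattr(self.m2m_owner, name).through for name in self.m2m_fields]

    def setup(self):
        # Imported lazily so this module loads without Django being configured.
        from django.db.models.signals import m2m_changed

        super().setup()  # keeps the usual post_save / post_delete wiring
        for through in self._through_models():
            m2m_changed.connect(self.handle_m2m_changed, sender=through)

    def teardown(self):
        from django.db.models.signals import m2m_changed

        for through in self._through_models():
            m2m_changed.disconnect(self.handle_m2m_changed, sender=through)
        super().teardown()

    def handle_m2m_changed(self, sender, instance, action, reverse=False,
                           model=None, pk_set=None, **kwargs):
        if action not in self.REINDEX_ACTIONS:
            return
        if not reverse:
            # Forward side, e.g. message_forum.tags.add(tag):
            # `instance` is the Message_forum object itself.
            changed = [instance]
        elif pk_set:
            # Reverse side, e.g. tag.message_forum_set.add(mf): `model` is the
            # owning model and `pk_set` holds the affected primary keys.
            changed = model.objects.filter(pk__in=pk_set)
        else:
            # Reverse clear: pk_set is None, so the affected rows are unknown here.
            changed = []
        for obj in changed:
            # Pass the owning model as sender so Haystack can find its index.
            self.handle_save(sender=self.m2m_owner, instance=obj)


# Hypothetical wiring in myapp/signals.py (module path is an assumption):
#
#   from haystack.signals import RealtimeSignalProcessor
#   from myapp.models import Message_forum
#
#   class MessageForumSignalProcessor(M2MReindexMixin, RealtimeSignalProcessor):
#       m2m_owner = Message_forum
#       m2m_fields = ("tags",)
#
#   # settings.py
#   HAYSTACK_SIGNAL_PROCESSOR = "myapp.signals.MessageForumSignalProcessor"
```

The handler ignores the pre_* actions because the index should reflect the relation as stored. A reverse-side clear is left unhandled in this sketch; covering it would require recording the related objects in the matching pre_clear call.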

    Alternatively, whenever any many-to-many field is changed, I can ensure that the save method of its parent (here, Message_forum.save()) is called after the change. This will always trigger the post_save signal, and Haystack will then update the indexed data for that object.
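The second option could look roughly like this. update_tags is a hypothetical helper name; the sketch assumes a Django version in which related managers provide set(), and it assumes that all tag changes in the application go through this helper.

```python
def update_tags(message_forum, tags):
    """Replace the tags of a Message_forum object, then re-save it.

    The save() call comes last on purpose: post_save fires after the
    relation has been updated, so the reindex sees the new tags.
    """
    message_forum.tags.set(tags)  # writes the many-to-many rows
    message_forum.save()          # fires post_save, which Haystack listens to
```

The order matters. If save() ran before the tags were changed, the index would be rebuilt from the old relation data.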

    Spent around 3 hours figuring this out. I hope this explanation helps someone facing a similar issue.