Tags: django, django-rest-framework, django-treebeard

How to prefetch descendants with django-treebeard's MP_Node?


I'm developing an application with a hierarchical data structure in django-rest-framework using django-treebeard. My (simplified) main model looks like this:

class Task(MP_Node):
    name = models.CharField(_('name'), max_length=64)
    started = models.BooleanField(default=True)

What I'm currently trying to achieve is a list view of all root nodes which shows extra fields (such as whether all children have started). To do this I specified a view:

class TaskViewSet(viewsets.ViewSet):

    def retrieve(self, request, pk=None):
        queryset = Task.get_tree().filter(depth=1, job__isnull=True)
        operation = get_object_or_404(queryset, pk=pk)
        serializer = TaskSerializer(operation)
        return Response(serializer.data)

and a serializer:

class TaskSerializer(serializers.ModelSerializer):
    are_children_started = serializers.SerializerMethodField()

    def get_are_children_started(self, obj):
        return all(task.started for task in Task.get_tree(obj))

This all works and I get the expected results. However, I run into an N+1 query problem: for each root task I have to fetch all of its descendants in a separate query. Normally this would be solvable with prefetch_related, but because I use the Materialized Path structure from django-treebeard there are no Django relationships between the task models, so prefetch_related doesn't know what to do out of the box. I've tried custom Prefetch objects, but since these still require a Django relation path I could not get them to work.
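To illustrate why the information is already there: in a materialized path, every node's path starts with the path of its root, so once all rows have been loaded (for example with one query) they can be grouped per root in plain Python. The sketch below is only an illustration under assumptions: `Row` is a hypothetical stand-in for `Task`, `STEPLEN` assumes treebeard's default `steplen` of 4, and the paths are made up. Unlike `Task.get_tree(obj)`, it leaves the root's own `started` flag out of the check.

```python
from dataclasses import dataclass

STEPLEN = 4  # assumed: treebeard's default MP_Node.steplen


@dataclass
class Row:
    """Hypothetical stand-in for a Task row (not a real Django model)."""
    path: str
    started: bool


def children_started_by_root(rows):
    """Map each root path to whether all of its descendants have started."""
    result = {}
    for row in rows:
        # The first STEPLEN characters of any path are the path of its root.
        root_path = row.path[:STEPLEN]
        # Register the root even if it has no descendants at all.
        result.setdefault(root_path, True)
        # Only descendants count; the root's own flag is ignored.
        if row.path != root_path and not row.started:
            result[root_path] = False
    return result


rows = [
    Row("0001", True),
    Row("00010001", True),
    Row("000100010001", False),  # a grandchild that has not started
    Row("0002", False),          # root flag is ignored by the check
    Row("00020001", True),
]
result = children_started_by_root(rows)
```

This keeps everything in one pass over the rows, but it moves the grouping into Python rather than into the ORM, which is exactly what I was hoping prefetch_related could do for me.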

My current idea is to extend the Task model with a foreign key pointing to its root node, like this:

root_node = models.ForeignKey('self', null=True,
                              on_delete=models.CASCADE,  # assumption: required on Django >= 2.0
                              related_name='descendant_tasks',
                              verbose_name=_('root task')
                              )

This would make the materialized-path relationship explicit so that it can be queried. However, it feels like a non-DRY way of doing it, so I wonder whether anyone has another suggestion for tackling it.


Solution

  • In the end I added a foreign key to each task pointing to its root node, like this:

    root_node = models.ForeignKey('self', null=True,
                                  on_delete=models.CASCADE,  # assumption: required on Django >= 2.0
                                  related_name='descendant_tasks',
                                  verbose_name=_('root task')
                                  )
    

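    I updated the save method on my Task model to make sure it always points to the correct root node:

    # Requires: from django.core.exceptions import ObjectDoesNotExist
    def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
        # Keep the denormalized root pointer in sync with the materialized path.
        try:
            self.root_node = self.get_root()
        except ObjectDoesNotExist:
            # No root row could be found, e.g. a root node saved for the first time.
            self.root_node = None

        # Pass the caller's arguments through instead of resetting them to defaults.
        return super(Task, self).save(force_insert=force_insert, force_update=force_update,
                                      using=using, update_fields=update_fields
                                      )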
    

    This allows me to prefetch all descendants of a root node with prefetch_related('descendant_tasks'), the related_name of the foreign key.

    Whenever I need the descendants in nested form, I use the following function to re-nest the flattened list of descendants:

    from operator import attrgetter

    def build_nested(tasks):
        """Turn a flat list of tasks into a tree using their materialized paths."""

        def get_basepath(path, depth):
            # The first depth * steplen characters identify the ancestor at that depth.
            return path[0:depth * Task.steplen]

        container, link = [], {}
        # Process shallow nodes first so a parent is registered before its children.
        for task in sorted(tasks, key=attrgetter('depth')):
            depth = len(task.path) // Task.steplen
            try:
                parent_path = get_basepath(task.path, depth - 1)
                parent_obj = link[parent_path]
                if not hasattr(parent_obj, 'sub_tasks'):
                    parent_obj.sub_tasks = []
                parent_obj.sub_tasks.append(task)
            except KeyError:  # Append it as root task if no parent exists
                container.append(task)

            link[task.path] = task

        return container
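
A small usage sketch of the nesting function, runnable without a database. The `Task` class here is a hypothetical stand-in that only mimics the `path`, `depth` and `steplen` attributes of the real MP_Node model (with treebeard's default `steplen` of 4 assumed), and the names and paths are made up for the demonstration.

```python
from operator import attrgetter


class Task:
    """Hypothetical stand-in for the MP_Node model, just enough for the demo."""
    steplen = 4  # assumed: treebeard's default step length

    def __init__(self, name, path):
        self.name = name
        self.path = path
        self.depth = len(path) // self.steplen


def build_nested(tasks):
    """Same logic as the function above, repeated so the demo is self-contained."""

    def get_basepath(path, depth):
        return path[0:depth * Task.steplen]

    container, link = [], {}
    for task in sorted(tasks, key=attrgetter('depth')):
        depth = len(task.path) // Task.steplen
        try:
            parent_obj = link[get_basepath(task.path, depth - 1)]
            if not hasattr(parent_obj, 'sub_tasks'):
                parent_obj.sub_tasks = []
            parent_obj.sub_tasks.append(task)
        except KeyError:  # no parent in the list: treat it as a root
            container.append(task)
        link[task.path] = task
    return container


# A flat list in arbitrary order, as a prefetch might return it.
flat = [
    Task("a1", "000100010001"),
    Task("a", "00010001"),
    Task("root", "0001"),
    Task("b", "00010002"),
]
nested = build_nested(flat)
root = nested[0]
child_names = [t.name for t in root.sub_tasks]
grandchild_names = [t.name for t in root.sub_tasks[0].sub_tasks]
```

Note that leaf tasks never get a `sub_tasks` attribute, so code that walks the result should use something like `getattr(task, 'sub_tasks', [])`.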