djangocsvdjango-rest-frameworkexport-to-csvxlsx

How to denormalize hierarchical Django model data for rendering CSV or Excel with Djange REST Framework?


Assume, we are building an address book application with Django REST Framework and want to render an endpoint that exports all persons. Each person can have one or multiple phone numbers.

Exemplary data could look like this:

[
  {
    'name': 'Jon Doe',
    'phone': 
    [
      {
        'type': 'home',
        'number': '+1 234 5678'
      }
    ]
  },
  {
    'name': 'Jane Doe',
    'phone':
    [
      {
        'type': 'home',
        'number': '+2 345 6789'
      },
      {
        'type': 'work',
        'number': '+3 456 7890'
      }
    ]
  }
]   

As we want to export CSV or Excel tables, we want to denormalize the data so that each phone number gets its own line.

A result could look like this:

name,phone.type,phone.number
Jon Doe,home,+1 234 5678
Jane Doe,home,+2 345 6789
Jane Doe,work,+3 456 7890

The question is, where exactly I would do the denormalization. I see two options:

  1. Write a custom Serializer that does the denormalization. On the upside, this would result in a single change that works for every Renderer, so I could have the endpoint export CSV and Excel using, e.g. djangorestframework-csv and drf-renderer-xlsx. On the downside, this would interfere with renderers that do not benefit from denormalization like JSON or XML.
  2. Derive each Renderer that needs denormalization and override the process_data() method to first denormalize and then invoke the super-class implementation.
  3. Write a custom View that does the denormalization based on the negotiated renderer, like described in https://www.django-rest-framework.org/api-guide/renderers/#varying-behavior-by-media-type.

This seems like an issue that many people could have, as tabular data export is a very common feature. Any hints on where I should start or what would be the best alternative?


Solution

  • Assuming you have models such as:

    class Person(models.Model):
        name = models.CharField(max_length=100)
        def __str__(self):
            return self.name
    
    class Phone(models.Model):
        person = models.ForeignKey(Person, on_delete=models.CASCADE)
        type = models.CharField(max_length=100)
        number = models.CharField(max_length=100)
        def __str__(self):
            return self.number
    

    Then create a serializer and viewset for Phone (rather than Person) like so:

    class PhoneSerializer(serializers.HyperlinkedModelSerializer):
        person = serializers.StringRelatedField()
        class Meta:
            model = Phone
            fields = ['person', 'type', 'number']
    
    class PhoneViewset(viewsets.ModelViewSet):
        queryset = Phone.objects.all()
        serializer_class = PhoneSerializer
    
    router = routers.DefaultRouter()
    router.register(r'phone', PhoneViewset)
    
    

    Then DRF will produce something like this with its default json renderer:

    [
        {"person":"Mufune Toshirō","type":"home","number":"000-123-4567"},
        {"person":"Mufune Toshirō","type":"work","number":"000-345-6789"},
        {"person":"Tōno Eijirō","type":"home","number":"000-234-4567"},
        {"person":"Nakadai Tatsuya","type":"home","number":"000-234-6789"},
        {"person":"Tsukasa Yōko","type":"cell","number":"000-987-6543"}
    ]
    

    To get CSV output, you can install djangorestframework-csv. It turns the above into this:

    number,person,type
    000-123-4567,Mufune Toshirō,home
    000-345-6789,Mufune Toshirō,work
    000-234-4567,Tōno Eijirō,home
    000-234-6789,Nakadai Tatsuya,home
    000-987-6543,Tsukasa Yōko,cell
    

    On a real database with lots of tables and complicated relationships, I would create a database view from a query that contained all of the fields denormalized how I wanted them to be. Then, I'd create an unmanaged model (something like this) so it could be exported from DRF.

    In the above example, the view could be something like this:

    CREATE VIEW phonelist AS
        SELECT a.name, b.type, b.number  
        FROM Person a left join Phone b on (a.id = b.person);