
How to access request body when using Django Rest Framework and avoid getting RawPostDataException


I need to get the raw content of a POST request body (as a string), but when I try to access request.body I get an exception:

django.http.request.RawPostDataException:
You cannot access body after reading from request's data stream

I am aware that it is advised to use request.data instead of request.body when using Django Rest Framework. However, to validate a digital signature I need the request body in its raw, "untouched" form, since that is what the third party signed and what I need to validate.

Pseudocode:

3rd_party_sign(json_data + secret_key) != validate_sign(json.dumps(request.data) + secret_key)

3rd_party_sign(json_data + secret_key) == validate_sign(request.body + secret_key)
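
To make the mismatch concrete, here is a minimal sketch. It assumes the third party hashes the body followed by the secret with SHA-256; the real scheme may well differ, and `sign`, `raw_body` and `secret_key` are illustrative names only.

```python
import hashlib
import json


def sign(payload: bytes, secret_key: bytes) -> str:
    # Hypothetical scheme: SHA-256 over body + secret, mirroring the pseudocode above.
    return hashlib.sha256(payload + secret_key).hexdigest()


secret_key = b"not-a-real-secret"

# Bytes exactly as the third party sent (and signed) them.
raw_body = b'{"b":1,  "a":"x"}'
third_party_signature = sign(raw_body, secret_key)

# Parsing and re-serializing normalizes whitespace, so the bytes change.
reserialized = json.dumps(json.loads(raw_body)).encode("utf-8")

matches_raw = sign(raw_body, secret_key) == third_party_signature
matches_reserialized = sign(reserialized, secret_key) == third_party_signature
print(matches_raw, matches_reserialized)
```

Only the untouched bytes reproduce the signature, which is why request.data is not enough here.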

Solution

  • I found an interesting topic on DRF's GitHub, but it does not fully cover the problem. I investigated the case and came up with a neat solution. Surprisingly, there was no such question on SO, so I decided to add it for the public, following the SO self-answer guidelines.

    The key to understanding the problem and the solution is how HttpRequest.body (source) works:

    @property
    def body(self):
        if not hasattr(self, '_body'):
            if self._read_started:
                raise RawPostDataException("You cannot access body after reading from request's data stream")
            # (...)
            try:
                self._body = self.read()
            except IOError as e:
                raise UnreadablePostError(*e.args) from e
            self._stream = BytesIO(self._body)
        return self._body
    

    When body is accessed and self._body is already set, it is simply returned. Otherwise the internal request stream is read and the result is assigned to _body: self._body = self.read(). From then on, any further access to body just returns self._body. In addition, before the internal request stream is read there is an if self._read_started check, which raises an exception if reading has already started.

    The self._read_started flag is set by the read() method (source):

    def read(self, *args, **kwargs):
        self._read_started = True
        try:
            return self._stream.read(*args, **kwargs)
        except IOError as e:
            six.reraise(UnreadablePostError, ...)
    

    Now it should be clear that RawPostDataException is raised on access to request.body only if the read() method has already been called without its result being assigned to the request's self._body.
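
The interplay of body and read() can be reproduced without Django. The sketch below is a simplified stand-in that models only the two members quoted above; `FakeRequest` and the local `RawPostDataException` are hypothetical names, not Django's classes.

```python
from io import BytesIO


class RawPostDataException(Exception):
    """Stand-in for django.http.request.RawPostDataException."""


class FakeRequest:
    """Simplified model of HttpRequest.body and HttpRequest.read()."""

    def __init__(self, payload: bytes):
        self._stream = BytesIO(payload)
        self._read_started = False

    def read(self, *args, **kwargs):
        self._read_started = True
        return self._stream.read(*args, **kwargs)

    @property
    def body(self):
        if not hasattr(self, "_body"):
            if self._read_started:
                raise RawPostDataException(
                    "You cannot access body after reading from request's data stream"
                )
            self._body = self.read()
            self._stream = BytesIO(self._body)
        return self._body


# Case 1: body is accessed first, so it is cached and the stream is replaced.
first = FakeRequest(b'{"a": 1}')
body_first = first.body
reread = first.read()    # reads from the fresh BytesIO created by body
body_again = first.body  # served from the cached _body

# Case 2: something else (for example a parser) reads the stream first.
second = FakeRequest(b'{"a": 1}')
second.read()
try:
    second.body
    raised = False
except RawPostDataException:
    raised = True
```

In the first case every access succeeds; in the second, body raises because the stream was consumed without populating _body.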

    Now let's have a look at the DRF JSONParser class (source):

    class JSONParser(BaseParser):
        media_type = 'application/json'
        renderer_class = renderers.JSONRenderer
    
        def parse(self, stream, media_type=None, parser_context=None):
            parser_context = parser_context or {}
            encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
            try:
                data = stream.read().decode(encoding)
                return json.loads(data)
            except ValueError as exc:
                raise ParseError('JSON parse error - %s' % six.text_type(exc))
    

    (I have chosen a slightly older version of the DRF source, because after May 2017 there were some performance improvements that obscure the line that is key to understanding our problem.)

    Now it should be clear that the stream.read() call sets the _read_started flag, and therefore the body property cannot access the stream again once the parser has run.

    The solution

    The "no request.body" approach is intended by DRF (I guess), so although it is technically possible to enable access to request.body globally (via custom middleware), it should NOT be done without a deep understanding of all its consequences.
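
For reference only, the global middleware variant mentioned above might look roughly like the sketch below. `CacheRawBodyMiddleware` is a hypothetical name; the class relies solely on the behavior of the body property shown earlier and is not a recommendation.

```python
class CacheRawBodyMiddleware:
    """Hypothetical middleware that touches request.body before any parser runs."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Accessing the property makes Django read the stream and cache the
        # result in request._body, so later access no longer raises.
        # Caveat: the whole payload is then held in memory for every request.
        request.body
        return self.get_response(request)
```

It would have to be listed in the MIDDLEWARE setting, and it affects every request, including large uploads, which is one of the consequences to weigh.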

    Access to the raw request content can instead be granted explicitly and locally, in the following manner:

    You need to define a custom parser:

    import json
    from django.conf import settings
    from rest_framework.exceptions import ParseError
    from rest_framework import renderers
    from rest_framework.parsers import BaseParser
    
    class MyJSONParser(BaseParser):
        media_type = 'application/json'
        renderer_class = renderers.JSONRenderer
    
        def parse(self, stream, media_type=None, parser_context=None):
            parser_context = parser_context or {}
            encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
            request = parser_context.get('request')
            try:
                data = stream.read().decode(encoding)
                # Keep the decoded payload on the request as a body-like custom attribute.
                setattr(request, 'raw_body', data)
                return json.loads(data)
            except ValueError as exc:
                # str() replaces six.text_type, which was never imported here.
                raise ParseError('JSON parse error - %s' % str(exc))
    

    Then it can be used when it is necessary to access raw request content:

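    from rest_framework.decorators import api_view, parser_classes
    from rest_framework.response import Response

    @api_view(['POST'])
    @parser_classes((MyJSONParser,))
    def example_view(request, format=None):
        request.data  # DRF parses lazily; this runs MyJSONParser, which sets raw_body
        return Response({'received data': request.raw_body})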
    

    Meanwhile, request.body itself remains globally inaccessible (as the DRF authors intended).
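
With raw_body available, the signature check itself could be sketched as follows. This assumes a hex-encoded HMAC-SHA256 signature, which is only a guess at what a third party might use; `is_signature_valid` and its parameters are hypothetical names. Note that raw_body holds decoded text, so it has to be encoded back with the same encoding to recover the signed bytes (assuming the body was valid text in that encoding).

```python
import hashlib
import hmac


def is_signature_valid(raw_body: str, received_signature: str,
                       secret_key: bytes, encoding: str = "utf-8") -> bool:
    # raw_body is the decoded text stored by the parser; encode it back with
    # the same encoding to recover the bytes that were signed.
    payload = raw_body.encode(encoding)
    # Assumed scheme: hex HMAC-SHA256; substitute the provider's real algorithm.
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, received_signature)


secret_key = b"not-a-real-secret"
raw_body = '{"amount":10,"currency":"EUR"}'
good_signature = hmac.new(
    secret_key, raw_body.encode("utf-8"), hashlib.sha256
).hexdigest()

valid = is_signature_valid(raw_body, good_signature, secret_key)
tampered = is_signature_valid(raw_body.replace("10", "99"), good_signature, secret_key)
```

Inside the view, the first argument would be request.raw_body and the second the signature taken from wherever the third party sends it.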