python django post webhooks mailchimp

Parse Nested Url Encoded Data from POST Request


I am currently creating a callback URL in Django for a Mailchimp webhook. Mailchimp sends a POST request whose body is encoded as application/x-www-form-urlencoded.

The issue I have run into is that the data returned contains nested data. Some of the keys in this urlencoded string use bracket notation that looks like it's defining nested JSON, which I believe is non-standard (I could be mistaken, though).

For example, one POST request from Mailchimp, which is sent when a user changes their name, would look like:

type=profile&fired_at=2021-05-25+18%3A03%3A23&data%5Bid%5D=abcd1234&data%5Bemail%5D=test%40domain.com&data%5Bemail_type%5D=html&data%5Bip_opt%5D=0.0.0.0&data%5Bweb_id%5D=1234&data%5Bmerges%5D%5BEMAIL%5D=test%40domain.com&data%5Bmerges%5D%5BFNAME%5D=first_name&data%5Bmerges%5D%5BLNAME%5D=last_name&data%5Blist_id%5D=5678

Using Django's request.POST, the data is decoded into:

{
    'type': 'profile',
    'fired_at': '2021-05-25 18:03:23',
    'data[id]': 'abcd1234',
    'data[email]': 'test@domain.com',
    'data[email_type]': 'html',
    'data[ip_opt]': '0.0.0.0',
    'data[web_id]': '1234',
    'data[merges][EMAIL]': 'test@domain.com',
    'data[merges][FNAME]': 'first_name',
    'data[merges][LNAME]': 'last_name',
    'data[list_id]': '5678'
}

This looks really ugly in practice, since to access the first name of the user from request.POST we would have to do

request.POST.get('data[merges][FNAME]', None)

The data is obviously intended to look like

{
    'type': 'profile',
    'fired_at': '2021-05-25 18:03:23',
    'data': {
        'id': 'abcd1234',
        'email': 'test@domain.com',
        'email_type': 'html',
        'ip_opt': '0.0.0.0',
        'web_id': '1234',
        'merges': {
            'EMAIL': 'test@domain.com',
            'FNAME': 'first_name',
            'LNAME': 'last_name',
        },
        'list_id': '5678'
    },
}

and be accessed like

data = request.POST.get('data', None)
first_name = data['merges']['FNAME']

I have looked for a Django/Python specific way to decode this nested URL-encoded data into more appropriate formats to work with it in Python, but have been unable to find anything. Python's urllib library provides methods such as urllib.parse.parse_qs() to decode urlencoded strings, but these methods do not handle this nested type data.
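
A minimal sketch of that limitation, using a shortened version of the query string above (the trimmed string is only for illustration):

```python
from urllib.parse import parse_qs

# Shortened, illustrative version of the Mailchimp payload shown earlier.
query = 'type=profile&data%5Bmerges%5D%5BFNAME%5D=first_name'
parsed = parse_qs(query)

# parse_qs percent-decodes the keys but leaves the bracket notation untouched,
# and it wraps every value in a list.
print(parsed)
# {'type': ['profile'], 'data[merges][FNAME]': ['first_name']}
```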

Is there a way to properly decode this nested urlencoded data using Django/Python?


Solution

  • Neither the Python standard library nor Django provides a utility function for this.

    We can implement convert_form_dict_to_json_dict as such:

    1. Initialise json_dict to an empty dict {}.
    2. For each form_key, using the example 'data[merges][EMAIL]',
      1. Use regex to obtain nested_keys, i.e. ('data', 'merges', 'EMAIL').
      2. Determine last_nesting_level, i.e. 2 from nesting levels (0, 1, 2).
      3. Initialise current_dict to json_dict.
      4. For each nesting_level, current_key, i.e. 0, 'data', 1, 'merges', 2, 'EMAIL',
        1. If it is before last_nesting_level, get (or create) the next current_dict under current_key.
        2. Else, set current_dict entry for current_key to value.
    3. Return json_dict.
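
    import re


    def convert_form_dict_to_json_dict(form_dict):
        """Convert flat keys such as 'data[merges][EMAIL]' into nested dicts."""
        json_dict = {}
        for form_key, value in form_dict.items():
            # Leading name plus every bracketed name:
            # 'data[merges][EMAIL]' -> ('data', 'merges', 'EMAIL')
            nested_keys = (
                re.match(r'\w+', form_key).group(0),
                *re.findall(r'\[(\w+)\]', form_key),
            )
            last_nesting_level = len(nested_keys) - 1
            current_dict = json_dict
            for nesting_level, current_key in enumerate(nested_keys):
                if nesting_level < last_nesting_level:
                    # Descend one level, creating the inner dict if needed.
                    current_dict = current_dict.setdefault(current_key, {})
                else:
                    # Innermost key: store the value.
                    current_dict[current_key] = value
        return json_dict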
    

    Usage:

    POST_dict = {
        'type': 'profile',
        'fired_at': '2021-05-25 18:03:23',
        'data[id]': 'abcd1234',
        'data[email]': 'test@domain.com',
        'data[email_type]': 'html',
        'data[ip_opt]': '0.0.0.0',
        'data[web_id]': '1234',
        'data[merges][EMAIL]': 'test@domain.com',
        'data[merges][FNAME]': 'first_name',
        'data[merges][LNAME]': 'last_name',
        'data[list_id]': '5678'
    }
    
    from pprint import pprint
    pprint(convert_form_dict_to_json_dict(POST_dict))
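
The same idea can also be applied directly to the raw urlencoded string rather than to an already-decoded dict. The following is only a sketch under some assumptions: parse_nested_form_body is a hypothetical helper name, the body is a shortened version of the payload from the question, and repeated keys simply overwrite earlier ones. In a Django view the raw body would be available as request.body (bytes), so it would need decoding to str first.

```python
import re
from urllib.parse import parse_qsl


def parse_nested_form_body(body):
    """Decode a urlencoded string with bracketed keys into nested dicts.

    Hypothetical helper for illustration; not part of Django or the stdlib.
    """
    nested = {}
    # parse_qsl decodes percent-escapes and '+' and returns (key, value) pairs.
    for key, value in parse_qsl(body, keep_blank_values=True):
        # 'data[merges][FNAME]' -> ['data', 'merges', 'FNAME']
        parts = re.findall(r'[^\[\]]+', key)
        if not parts:
            continue  # skip empty keys or keys made only of brackets
        current = nested
        for part in parts[:-1]:
            # Walk down, creating intermediate dicts as needed.
            current = current.setdefault(part, {})
        current[parts[-1]] = value
    return nested


# Shortened, illustrative version of the Mailchimp payload.
body = (
    'type=profile&fired_at=2021-05-25+18%3A03%3A23'
    '&data%5Bid%5D=abcd1234'
    '&data%5Bmerges%5D%5BFNAME%5D=first_name'
)
result = parse_nested_form_body(body)
print(result['data']['merges']['FNAME'])  # first_name
```

One limitation of this sketch: if the same name appears both as a plain key and as a bracketed prefix (for example data=x together with data[id]=y), the descent hits a string instead of a dict and raises a TypeError, so such input would need explicit handling.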