
How to sort a nested dictionary by keys in Python 3, taking into account the numbers in the key strings


Assume a dictionary like the one below, where each key is a string that ends with a number after a hyphen (-). The keys are in no particular order, the numbers are not guaranteed to form a continuous sequence, and the trailing number is the only pattern the keys follow:

nested_dict = {
    "outter_key_layer-3": {
        "inner_key_layer-4": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-11": "random_value",
        "inner_key_layer-22": "random_value",
        ...
        ...
        "inner_key_layer-31": "random_value",
        "inner_key_layer-112": "random_value",
        ...
    },
    "outter_key_layer-54": {
        "inner_key_layer-1": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-11": "random_value",
        "inner_key_layer-112": "random_value",
        ...
        "inner_key_layer-12": "random_value",
        "inner_key_layer-11": "random_value",
        ...
    },
    ...
    "outter_key_layer-13": {
        "inner_key_layer-1": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-5": "random_value",
        "inner_key_layer-10": "random_value",
        ...
        "inner_key_layer-6": "random_value",
        "inner_key_layer-23": "random_value",
        ...
    },
    ...
}

How can I get the following, where the keys are in alphabetical order but the trailing numbers are compared as integers:

nested_dict = {
    "outter_key_layer-1": {
        "inner_key_layer-1": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-11": "random_value",
        "inner_key_layer-12": "random_value",
        ...
        ...
        "inner_key_layer-111": "random_value",
        "inner_key_layer-112": "random_value",
        ...
    },
    "outter_key_layer-2": {
        "inner_key_layer-1": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-11": "random_value",
        "inner_key_layer-12": "random_value",
        ...
        "inner_key_layer-111": "random_value",
        "inner_key_layer-112": "random_value",
        ...
    },
    ...
    "outter_key_layer-11": {
        "inner_key_layer-1": "random_value",
        "inner_key_layer-2": "random_value",
        ...
        "inner_key_layer-11": "random_value",
        "inner_key_layer-12": "random_value",
        ...
        "inner_key_layer-111": "random_value",
        "inner_key_layer-112": "random_value",
        ...
    },
    ...
}

I tried to adapt the following from a previously asked question, but I can't get it to work; I haven't found a way to extract the integers and sort by them:

sorted_dict = {key: dict(sorted(nested_dict[key].items())) for key in sorted(nested_dict)}

With the above, the inner keys aren't sorted as expected because the numeric parts are still compared as strings: inner_key_layer-1, inner_key_layer-11, ..., inner_key_layer-12, inner_key_layer-2...
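
A minimal illustration of that behavior, using a handful of the keys from above (the keys list and the lexicographic name are only for demonstration):

```python
keys = [
    "inner_key_layer-2",
    "inner_key_layer-12",
    "inner_key_layer-1",
    "inner_key_layer-11",
]

# Plain sorted() compares the strings character by character,
# so "-11" and "-12" land before "-2".
lexicographic = sorted(keys)
print(lexicographic)
# ['inner_key_layer-1', 'inner_key_layer-11', 'inner_key_layer-12', 'inner_key_layer-2']
```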

I know there are many questions about sorting in Python for different use cases, but any help or suggestions would be welcome. Thank you.

Later edit: After some further research I found the natsort package, and using it this way yields essentially what I'm looking for:

from natsort import natsorted


my_sorted_dict = {key: dict(natsorted(nested_dict[key].items())) for key in sorted(nested_dict)}

However, I'm still wondering whether there's a way to do this without installing another package.


Solution

  • The trick is to split each key into a list of parts and use that list as the sort key. Remember to convert the trailing number to an integer with int(). Lists are compared element-wise, so the text parts are compared as strings and the number is compared as an integer.

    Here I used a recursive approach, but since the dictionary is nested only one level deep, you could replace the recursion with another for loop:

    from pprint import pprint
    
    nested_dict = {
        "outter_key_layer-3": {
            "inner_key_layer-4": "random_value",
            "inner_key_layer-2": "random_value",
            "inner_key_layer-11": "random_value",
            "inner_key_layer-22": "random_value",
            "inner_key_layer-31": "random_value",
            "inner_key_layer-112": "random_value",
        },
        "outter_key_layer-54": {
            "inner_key_layer-1": "random_value",
            "inner_key_layer-2": "random_value",
            "inner_key_layer-11": "random_value",
            "inner_key_layer-112": "random_value",
            "inner_key_layer-12": "random_value",
        },
        "outter_key_layer-13": {
            "inner_key_layer-1": "random_value",
            "inner_key_layer-2": "random_value",
            "inner_key_layer-5": "random_value",
            "inner_key_layer-10": "random_value",
            "inner_key_layer-6": "random_value",
            "inner_key_layer-23": "random_value",
        },
    }
    
    
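    def sort_key_func(x: tuple):
        # x is a (key, value) pair from dict.items(); only the key matters here.
        key, _value = x
        splitted = key.split("-")
        # Convert the trailing number to int so that 2 sorts before 11.
        splitted[-1] = int(splitted[-1])
        return splitted
    
    
    def sorted_dict(d: dict):
        # Sort this level, then recurse into any values that are dicts.
        result = dict(sorted(d.items(), key=sort_key_func))
        for k, v in result.items():
            if isinstance(v, dict):
                result[k] = sorted_dict(v)
    
        return result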
    
    
    pprint(sorted_dict(nested_dict), sort_dicts=False)
    

    Output:

    {'outter_key_layer-3': {'inner_key_layer-2': 'random_value',
                            'inner_key_layer-4': 'random_value',
                            'inner_key_layer-11': 'random_value',
                            'inner_key_layer-22': 'random_value',
                            'inner_key_layer-31': 'random_value',
                            'inner_key_layer-112': 'random_value'},
     'outter_key_layer-13': {'inner_key_layer-1': 'random_value',
                             'inner_key_layer-2': 'random_value',
                             'inner_key_layer-5': 'random_value',
                             'inner_key_layer-6': 'random_value',
                             'inner_key_layer-10': 'random_value',
                             'inner_key_layer-23': 'random_value'},
     'outter_key_layer-54': {'inner_key_layer-1': 'random_value',
                             'inner_key_layer-2': 'random_value',
                             'inner_key_layer-11': 'random_value',
                             'inner_key_layer-12': 'random_value',
                             'inner_key_layer-112': 'random_value'}}