pythonpython-regex

convert string which contains sub string to dictionary


I am tring to convert particular strings which are in particular format to Python dictionary. String format is like below,

st1 = 'key1 key2=value2 key3="key3.1, key3.2=value3.2 , key3.3 = value3.3, key3.4" key4'

I want to parse it and convert to dictionary as below,

dict1 {
    key1: None,
    key2: value2,
    key3: {
            key3.1: None,
            key3.2: value3.2,
            key3.3: value3.3,
            key3.2: None
          }
    key4: None,

I tried to use python re package and string split function. not able to acheive the result. I have thousands of string in same format, I am trying to automate it. could someone help.


Solution

  • If all your strings are consistent, and only have 1 layer of sub dict, this code below should do the trick, you may need to make tweaks/changes to it.

    import json
    
    st1 = 'key1 key2=item2 key3="key3.1, key3.2=item3.2 , key3.3 = item3.3, key3.4" key4'
    st1 = st1.replace(' = ', '=')
    st1 = st1.replace(' ,', ',')
    new_dict = {}
    no_keys=False
    
    while not no_keys:
        st1 = st1.lstrip()
        
        if " " in st1:
            item = st1.split(" ")[0]
        else:
            item = st1
        
        if '=' in item:
            if '="' in item:
                item = item.split('=')[0]
                new_dict[item] = {}     
                
                st1 = st1.replace(f'{item}=','')
                sub_items = st1.split('"')[1]
                sub_values = sub_items.split(',')
    
                for sub_item in sub_values:
                    if "=" in sub_item:
                        sub_key, sub_value = sub_item.split('=')
                        new_dict[item].update({sub_key.strip():sub_value.strip()})
                    else:
                        new_dict[item].update({sub_item.strip(): None})
                
                st1 = st1.replace(f'"{sub_items}"', '')
            else:
                key, value = item.split('=')
                new_dict.update({key:value})
                st1 = st1.replace(f"{item} ","")
        else:
            new_dict.update({item: None})
            st1 = st1.replace(f"{item}","")
            
        if st1 == "":
            no_keys=True    
        
    print(json.dumps(new_dict, indent=4))