pythonpython-2.7dictionary

How to break a CSV string into a dictionary of dictionaries (or lists) in Python 2.7?


I have created a Python script that gets data from an API which is returned in a 'text/csv' format. What I'd like to do is use the headers from the CSV file to create a dictionary of dictionaries, or possibly a dictionary of lists depending on which is more effective.

The output is one long string, which I have broken up into a list and then a dictionary with the following code (I've sanitized this a bit):

# Makes API call
resultsreturn = requests.get(url,headers=head)

# Grabs text from API call
# Data is returned in one long string:
# '"Header1,Header2,Header3,Header4\\nR1C1,R1C2,R1C3,R1C4\\nR2C1,R2C2,R2C3,R2C4"'
results_json_data = json.dumps(resultsreturn.text)

# Splits results into list:
# ['"Header1,Header2,Header3,Header4', 'R1C1,R1C2,R1C3,R1C4', 'R2C1,R2C2,R2C3,R2C4"']
list_results_split = results_json_data.split('\\n')

#Splits list into dictionary.
dict_results = dict(zip(range(len(list_results_split)), list_results_split))

Printing dict_results looks like this:

{0: '"Header1,Header2,Header3,Header4', 
 1: 'R1C1,R1C2,R1C3,R1C4'
 2: 'R2C1,R2C2,R2C3,R2C4"'}

So what I'd like to do is somehow get something that looks like this:

{0: {"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"}
 1: {"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"}
 2: {"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}}

I also just noticed that the first string that gets created from results_json_data has a " at the beginning and a " at the end which I'll probably need to strip out to get everything looking like I want to. Hoping someone can point me in the right direction; I'm fairly new to programming / Python.


Solution

  • Take a look at the csv module and the DictReader class in particular. If possible, you should process csv data using an established library instead of doing it yourself:

    > import csv
    
    # first param must be an iterable producing strings (the lines of your csv data)
    # this tyically is a file-like object, but can be a plain list
    > reader = csv.DictReader(list_results_split, delimiter=',')
    > reader.fieldnames
    ["Header1", "Header2", "Header3"]
    
    > lst = list(reader)
    [{"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"},
     {"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"},
     {"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}]
    
    # And
    > dict(enumerate(lst))
    {0: {"Header1":"R1C1", "Header2":"R1C2", "Header3":"R1C3", "Header4":"R1C4"}
     1: {"Header1":"R2C1", "Header2":"R2C2", "Header3":"R2C3", "Header4":"R2C4"}
     2: {"Header1":"R3C1", "Header2":"R3C2", "Header3":"R3C3", "Header4":"R3C4"}}
    

    Looking at your original string and your output, you should consider stripping the " before processing:

    results_json_data = results_json_data.strip('"')