Python 3.9.5: One dictionary assignment is overwriting multiple keys [BUG?]


I am reading a .csv file called courses. Each row corresponds to a course, which has an id, a name, and a teacher. The courses are to be stored in a dict keyed by id, for example:

list_courses = { 
    1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
    ... 
 }

While iterating the rows using enumerate(file_csv.readlines()) I am performing the following:

list_courses={}

for idx, row in enumerate(file_csv.readlines()):
    # Skip blank rows.
    if row.isspace(): continue

    # If we're using the row, turn it into a list.
    row = row.strip().split(",")

    # If it's the header row, take note of the header. Use these values for the dictionaries' keys.
    # As of 3.7 a Dict remembers the order in which the keys were inserted.
    # Since the order is constant, simply load each other row into the corresponding key.
    if not idx:
        sheet_item = dict.fromkeys(row)
        continue

    # Loop through the keys in sheet_item. Assign the value found in the row, converting to int where necessary.
    for idx, key in enumerate(list(sheet_item)):
        sheet_item[key] = int(row[idx].strip()) if key == 'id' or key == 'mark' else row[idx].strip()


    # Course list
    print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
    list_courses[sheet_item['id']] = sheet_item
    print("\tADDED: {}".format(sheet_item))
    print("\tDICT : {}".format(list_courses))

Thus, the list_courses dictionary is printed after each sheet_item is added to it.

Now comes the issue - when reading in two courses, I expect that list_courses should read:

list_courses = { 
    1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'},
    2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
 }

However, the output of my print statements (substantiated by errors later in my program) is:

ADDING COURSE WITH ID 1 TO THE DICTIONARY:
        ADDED: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}
        DICT : {1: {'id': 1, 'name': 'Biology', 'teacher': 'Mr. D'}}
ADDING COURSE WITH ID 2 TO THE DICTIONARY:
        ADDED: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}
        DICT : {1: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}, 2: {'id': 2, 'name': 'History', 'teacher': 'Mrs. P'}}

Thus, the id under which each sheet_item is added to list_courses is correct (1 or 2). However, the assignment for the second course appears to overwrite the value already stored under key 1. I'm not even sure how this is possible. Please let me know your thoughts.


Solution

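  • You're using the same dictionary for the header and for every row. dict.fromkeys(row) runs once, on the header row, and no new dictionary is created after that. Each list_courses[sheet_item['id']] = sheet_item stores a reference to that one object rather than a copy, so every key in list_courses points to the same dictionary, and filling in the next row changes what all of them show.

    Store the header keys in a list instead, and create a new sheet_item for each row, just before the inner for loop: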

    list_courses = {}
    keys = None  # Set from the header row below.

    for idx, row in enumerate(file_csv.readlines()):
        # Skip blank rows.
        if row.isspace():
            continue

        # If we're using the row, turn it into a list.
        row = row.strip().split(",")

        # If it's the header row, keep its values as the keys for every row's dictionary.
        if not idx:
            keys = row
            continue

        # Create a new dictionary for this row so earlier rows are not overwritten.
        sheet_item = {}
        # Loop through the header keys. Assign the value found in the row, converting to int where necessary.
        # The column index gets its own name (col) so it does not shadow the row index idx.
        for col, key in enumerate(keys):
            value = row[col].strip()
            sheet_item[key] = int(value) if key in ('id', 'mark') else value

        # Course list
        print("ADDING COURSE WITH ID {} TO THE DICTIONARY:".format(sheet_item['id']))
        list_courses[sheet_item['id']] = sheet_item
        print("\tADDED: {}".format(sheet_item))
        print("\tDICT : {}".format(list_courses))
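
To see the effect in isolation, here is a minimal sketch that reproduces it without any file I/O. The names rows, shared, aliased, and independent are hypothetical and only serve the illustration; they are not from the question's code.

```python
# Hypothetical stand-in for the parsed CSV rows.
rows = [(1, "Biology"), (2, "History")]

# Buggy pattern: one dict object is reused for every row.
shared = {}
aliased = {}
for course_id, name in rows:
    shared["id"] = course_id
    shared["name"] = name
    aliased[course_id] = shared  # stores a reference to shared, not a copy

# Fixed pattern: a fresh dict is created for every row.
independent = {}
for course_id, name in rows:
    item = {"id": course_id, "name": name}
    independent[course_id] = item

print(aliased[1] is aliased[2])          # True: both keys hold the same object
print(aliased[1]["name"])                # History: row 2 overwrote row 1
print(independent[1] is independent[2])  # False: two separate objects
print(independent[1]["name"])            # Biology
```

Checking with `is` is a quick way to confirm whether two entries refer to the same object. If reusing a template dictionary is preferred, storing `shared.copy()` instead of `shared` should also avoid the aliasing, since each stored value is then a separate (shallow) copy.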