Tags: python, database, list, sqlite, dictionary

Is there a natural way to load the data of a sqlite3 table as a dictionary of lists rather than a list of dictionaries?


The following table my_table

idx   speed   duration
 0     420       50
 1     380       40
 2     390       45

is stored in a database file my_db.db. Loading this table as a list of dictionaries appears straightforward: the code

import sqlite3

conn = sqlite3.connect(folder_path + 'my_db.db')
conn.row_factory = sqlite3.Row
cur = conn.cursor()
cur.execute('SELECT * FROM my_table')
list_of_objs = cur.fetchall()
list_of_dicts = [dict(obj) for obj in list_of_objs]
cur.close()
conn.close()

yields

list_of_dicts = [
  {'idx': 0, 'speed': 420, 'duration': 50},
  {'idx': 1, 'speed': 380, 'duration': 40},
  {'idx': 2, 'speed': 390, 'duration': 45}
]

However, for my data, I would like to load the data as a dictionary of lists:

dict_of_lists = {
  'idx': [0, 1, 2],
  'speed': [420, 380, 390],
  'duration': [50, 40, 45]
}

Is there a straightforward way to do this?

(Part of my motivation for the dictionary-of-lists representation is that it seems like it would require less RAM than a list of dictionaries when the table has many rows. But does it, or does Python somehow recognize and avoid the redundancy of keys in the list-of-dictionaries representation?)
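To probe the RAM question, here is a small sketch on a toy dataset (the rows and column names mirror my_table): Python does intern the key strings, so the keys themselves are shared objects rather than copies, but each dict in the list still carries its own hash table, which `sys.getsizeof` makes visible. Note that `sys.getsizeof` is shallow and only counts the containers, not the values inside them.

```python
import sys

# Toy data mirroring my_table
rows = [(0, 420, 50), (1, 380, 40), (2, 390, 45)]
cols = ('idx', 'speed', 'duration')

list_of_dicts = [dict(zip(cols, row)) for row in rows]
dict_of_lists = {c: [row[i] for row in rows] for i, c in enumerate(cols)}

# One hash table per row vs. one list per column
per_row_overhead = sum(sys.getsizeof(d) for d in list_of_dicts)
per_col_overhead = sum(sys.getsizeof(v) for v in dict_of_lists.values())
print(per_row_overhead, per_col_overhead)
```

On CPython the per-row figure grows linearly with the row count while the per-column figure grows only with the total number of values, so the gap widens as the table gets larger.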


Solution

  • Similar to the previous answer, this uses fetchmany(), but is several times faster; just the juicy parts:

    data: dict[str, list] = collections.defaultdict(list)
    while chunk := cursor.fetchmany(CHUNK_SIZE):
        for values, col_name in zip(zip(*chunk), columns):
            data[col_name].extend(values)
    

    This is roughly 5× faster, as it avoids a dictionary lookup for every single column of every single row, and pushes the inner iteration down into zip's tight C loop.
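Fleshing out the juicy parts above into a self-contained sketch (the in-memory database and the `CHUNK_SIZE` value are illustrative stand-ins for your my_db.db file): `columns` comes from `cursor.description`, and `zip(*chunk)` transposes each row-wise chunk into columns.

```python
import collections
import sqlite3

CHUNK_SIZE = 10_000  # assumed chunk size; tune for your memory/speed trade-off

# Throwaway in-memory database mirroring my_table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE my_table (idx INTEGER, speed INTEGER, duration INTEGER)')
conn.executemany('INSERT INTO my_table VALUES (?, ?, ?)',
                 [(0, 420, 50), (1, 380, 40), (2, 390, 45)])

cursor = conn.execute('SELECT * FROM my_table')
columns = [desc[0] for desc in cursor.description]

data: dict[str, list] = collections.defaultdict(list)
while chunk := cursor.fetchmany(CHUNK_SIZE):
    # zip(*chunk) transposes the chunk from rows to columns
    for values, col_name in zip(zip(*chunk), columns):
        data[col_name].extend(values)

conn.close()
print(dict(data))
# {'idx': [0, 1, 2], 'speed': [420, 380, 390], 'duration': [50, 40, 45]}
```

Extending a list per chunk rather than appending per cell is where most of the speedup comes from.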