I have weather data in a Python dict that I'm trying to convert to a pandas DataFrame. (From there I'll load it into SQL Server, but I've got that part working.)
my_dict = {
'data.outdoor.temperature': {'unit': '℃', 'list':
{'datetime.datetime(2025, 4, 23, 10, 0): 22.3',
'datetime.datetime(2025, 4, 23, 14, 0): 21.3',
'datetime.datetime(2025, 4, 23, 18, 0): 18.2', }},
'data.indoor.temperature': {'unit': '℃', 'list':
{'datetime.datetime(2025, 4, 23, 10, 0): 23.2',
'datetime.datetime(2025, 4, 23, 14, 0): 23.5',
'datetime.datetime(2025, 4, 23, 18, 0): 22.9'}}}
The output I'm after is:
outdoor.temperature | 2025_04_20 | 14 | 28.1 | ℃
outdoor.temperature | 2025_04_20 | 18 | 23.8 | ℃
.....etc
indoor.temperature | 2025_04_20 | 14 | 23.5 | ℃
indoor.temperature | 2025_04_20 | 18 | 23.8 | ℃
.....etc
The list of datetime & temperatures could be 100s of records.
I've tried from_dict, concat / unstack, normalize, flatten and combinations of the lot, but I'm beaten.
I changed your dictionary constructor to build datetime objects rather than a set of strings of the form datetime.datetime(#, #, #, #, #): ##.#, as I took the data to be meant in dictionary form:
import datetime

my_dict1 = {
    'data.outdoor.temperature': {'unit': '℃', 'list': {
        datetime.datetime(2025, 4, 23, 10, 0): '22.3',
        datetime.datetime(2025, 4, 23, 14, 0): '21.3',
        datetime.datetime(2025, 4, 23, 18, 0): '18.2'}},
    'data.indoor.temperature': {'unit': '℃', 'list': {
        datetime.datetime(2025, 4, 23, 10, 0): '23.2',
        datetime.datetime(2025, 4, 23, 14, 0): '23.5',
        datetime.datetime(2025, 4, 23, 18, 0): '22.9'}}}
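If the readings really do arrive as a set of strings, exactly as written in the question, they would need parsing before the steps here apply. A minimal sketch, assuming every string has the form datetime.datetime(Y, M, D, h, m): value. The parse_reading and to_dict_form helpers and the regular expression are my own illustration, not something from the original data source:

```python
import datetime
import re

# Assumed shape of each entry: "datetime.datetime(2025, 4, 23, 10, 0): 22.3"
READING = re.compile(r"datetime\.datetime\(([^)]*)\):\s*(\S+)")


def parse_reading(text):
    """Hypothetical helper: split one string into (datetime, value string)."""
    match = READING.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unexpected reading format: {text!r}")
    # int() tolerates the spaces left over after splitting on commas
    parts = [int(p) for p in match.group(1).split(",")]
    return datetime.datetime(*parts), match.group(2)


def to_dict_form(raw):
    """Rebuild each 'list' entry as a {datetime: value} dictionary."""
    return {
        key: {"unit": entry["unit"],
              "list": dict(parse_reading(s) for s in entry["list"])}
        for key, entry in raw.items()
    }
```

Calling to_dict_form(my_dict) on the question's data should give something equivalent to my_dict1. A set has no guaranteed order, so sort by timestamp later if the row order matters.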
Afterwards, I constructed a DataFrame from the dictionary, appended the unit column to the index, expanded the list column by turning each dictionary into a Series (unpacking it), then stacked the whole thing and reset the index.
import pandas as pd
df = pd.DataFrame.from_dict(my_dict1, orient='index').set_index('unit', append=True)['list'].apply(pd.Series).stack().reset_index()
This yielded the following:
                    level_0  unit              level_2     0
0  data.outdoor.temperature     ℃  2025-04-23 10:00:00  22.3
1  data.outdoor.temperature     ℃  2025-04-23 14:00:00  21.3
2  data.outdoor.temperature     ℃  2025-04-23 18:00:00  18.2
3   data.indoor.temperature     ℃  2025-04-23 10:00:00  23.2
4   data.indoor.temperature     ℃  2025-04-23 14:00:00  23.5
5   data.indoor.temperature     ℃  2025-04-23 18:00:00  22.9
(It might appear misaligned because the ℃ character is wider than a standard monospaced character.) You can then rename the columns by assigning to df.columns. I chose df.columns = ['where', 'unit', 'date', 'temp'].
You can clean up where pretty simply: df['where'] = df['where'].str.replace(r'^data\.', '', regex=True).
The constructor also gives temp as a string, so I would do df['temp'] = df['temp'].pipe(pd.to_numeric) to make it numeric.
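Putting the steps together, here is a rough end-to-end sketch. The last part, which splits the timestamp into a date string and an hour column, is my assumption about the layout asked for in the question (date like 2025_04_23, then hour, temperature, unit); the hour column name is made up for illustration:

```python
import datetime

import pandas as pd

my_dict1 = {
    'data.outdoor.temperature': {'unit': '℃', 'list': {
        datetime.datetime(2025, 4, 23, 10, 0): '22.3',
        datetime.datetime(2025, 4, 23, 14, 0): '21.3',
        datetime.datetime(2025, 4, 23, 18, 0): '18.2'}},
    'data.indoor.temperature': {'unit': '℃', 'list': {
        datetime.datetime(2025, 4, 23, 10, 0): '23.2',
        datetime.datetime(2025, 4, 23, 14, 0): '23.5',
        datetime.datetime(2025, 4, 23, 18, 0): '22.9'}},
}

# One row per sensor, then one row per (sensor, unit, timestamp)
df = (
    pd.DataFrame.from_dict(my_dict1, orient='index')
    .set_index('unit', append=True)['list']
    .apply(pd.Series)   # expand each {datetime: value} dict into columns
    .stack()            # move the timestamp columns into the index
    .reset_index()
)
df.columns = ['where', 'unit', 'date', 'temp']

# Drop the leading "data." and convert the temperature strings to floats
df['where'] = df['where'].str.replace(r'^data\.', '', regex=True)
df['temp'] = pd.to_numeric(df['temp'])

# Assumed final layout: date as YYYY_MM_DD plus a separate hour column
timestamps = pd.to_datetime(df['date'])
df['hour'] = timestamps.dt.hour
df['date'] = timestamps.dt.strftime('%Y_%m_%d')
df = df[['where', 'date', 'hour', 'temp', 'unit']]

print(df)
```

With hundreds of readings per sensor, .apply(pd.Series) can get slow, so it may be worth timing this on real data before settling on it.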