I am trying to convert the following column into new rows:
Id | Prices |
---|---|
001 | ["March:59", "April:64", "May:62"] |
002 | ["Jan:55", ETC] |
to
id | date | price |
---|---|---|
001 | March | 59 |
001 | April | 64 |
001 | May | 62 |
002 | Jan | 55 |
The date:price pairs aren't stored in a traditional dictionary format like the following solution:
Convert dictionary keys to rows and show all the values in single column using Pandas
I managed to get the key:value pairs into individual rows like:
Id | Prices |
---|---|
001 | March:59 |
001 | April:64 |
And could split these into two columns using string manipulation but this feels inefficient instead of actually using the key:value pairs. Can anyone help please?
If you have valid lists, explode
and split
:
df = pd.DataFrame({'Id': ['001', '002'],
'Prices': [["March:59", "April:64", "May:62"], ["Jan:55"]]})
out = df.explode('Prices')
out[['date', 'price']] = out.pop('Prices').str.split(':', expand=True)
If you have strings, str.extractall
with a regex and join
:
df = pd.DataFrame({'Id': ['001', '002'],
'Prices': ['["March:59", "April:64", "May:62"]', '["Jan:55"]']})
out = (df.drop(columns='Prices')
.join(df['Prices'].str.extractall(r'(?P<date>[^":]+):(?P<price>[^":]+)')
.droplevel('match'))
)
Output:
Id date price
0 001 March 59
0 001 April 64
0 001 May 62
1 002 Jan 55
regex demo for the second approach.