I have below CSV file
timestamp,store,customer_name,basket_items,total_price,cash_or_card,card_number
06/06/2022 09:00,Chesterfield,Stephanie Neyhart,"Large Flat white - 2.45, Large Flavoured iced latte - Vanilla - 3.25, Large Flavoured iced latte - Hazelnut - 3.25",8.95,CASH,
I want to split basket_items
like
product price
Large Flat white 2.45
Large Flavoured iced latte - Vanilla 3.25
Large Flavoured iced latte - Hazelnut 3.25
How can I do that with pandas dataframe?
We simply need to split that column twice; once on comma and explode it; that would put each item in its own row. Then split again in ' - '
to separate the product name and its price into separate columns.
#data
df = pd.DataFrame([{
'timestamp': '06/06/2022 09:00',
'store': 'Chesterfield',
'customer_name': 'Stephanie Neyhart',
'basket_items': "Large Flat white - 2.45, Large Flavoured iced latte - Vanilla - 3.25, Large Flavoured iced latte - Hazelnut - 3.25",
'total_price': 8.95,
'cash_or_card': 'CASH'}])
# split by comma and explode (to separate products into multi-rows)
# split by dash once from the right side to separate product from price
res = (
df['basket_items'].str.split(', ')
.explode()
.str.rsplit(' - ', n=1, expand=True)
)
# set column names
res.columns = ['product', 'price']