I have a DataFrame with two columns (Customer, Transactions). The Transactions column holds a tuple of all the transaction IDs of that customer.
Customer  Transactions
1         (a,b,c)
2         (d,e)
I want to convert this into a DataFrame with one row per customer and transaction ID, like this:
Customer  Transactions
1         a
1         b
1         c
2         d
2         e
This can be done with loops, but is there a straightforward one- or two-line way to do it?
You can use the DataFrame constructor:
import pandas as pd

df = pd.DataFrame({'Customer':[1,2],
                   'Transactions':[('a','b','c'),('d','e')]})
print (df)
   Customer Transactions
0         1    (a, b, c)
1         2       (d, e)
df1 = pd.DataFrame(df.Transactions.values.tolist(), index=df.Customer)
print (df1)
          0  1     2
Customer
1         a  b     c
2         d  e  None
Then reshape with stack:
print (df1.stack().reset_index(drop=True, level=1).reset_index(name='Transactions'))
   Customer Transactions
0         1            a
1         1            b
2         1            c
3         2            d
4         2            e
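As an alternative, and assuming pandas 0.25 or newer is available, DataFrame.explode may give the same result in one step without building the intermediate wide frame. This is only a sketch on the sample data from the question; the name result is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Customer': [1, 2],
                   'Transactions': [('a', 'b', 'c'), ('d', 'e')]})

# explode() puts each element of a list-like cell (tuples included) on its
# own row and repeats the other columns; the original index is repeated too,
# so reset it to get a clean 0..n-1 index.
result = df.explode('Transactions').reset_index(drop=True)
print(result)
```

Note that explode leaves the Transactions column with object dtype, so a cast may be needed if the IDs are meant to be numeric.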