Tags: python, pandas, dataframe, tuples, reshape

Getting a tuple in a DataFrame into multiple rows


I have a DataFrame with two columns, Customer and Transactions. Each value in the Transactions column is a tuple of all the transaction IDs for that customer.

Customer Transactions
1        (a,b,c)
2        (d,e)

I want to convert this into a DataFrame with one row per customer and transaction ID, like this:

Customer  Transactions
1         a
1         b
1         c
2         d
2         e

This can be done with loops, but is there a straightforward one- or two-line way to do it?


Solution

  • You can use the DataFrame constructor to spread each tuple across its own columns:

    import pandas as pd

    df = pd.DataFrame({'Customer':[1,2],
                       'Transactions':[('a','b','c'),('d','e')]})
    
    print (df)
       Customer Transactions
    0         1    (a, b, c)
    1         2       (d, e)
    
    df1 = pd.DataFrame(df.Transactions.values.tolist(), index=df.Customer)
    print (df1)
              0  1     2
    Customer            
    1         a  b     c
    2         d  e  None
    

    Then reshape with stack:

    print (df1.stack().reset_index(drop=True, level=1).reset_index(name='Transactions'))
       Customer Transactions
    0         1            a
    1         1            b
    2         1            c
    3         2            d
    4         2            e
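
The two steps above can be combined into one self-contained script. The sketch below also shows an alternative that is not part of the original answer: `DataFrame.explode`, which assumes pandas 0.25 or later. The variable names `wide`, `stacked`, and `exploded` are illustrative only, and the explicit `dropna()` is added as a precaution in case `stack()` keeps the `None` padding in your pandas version.

```python
import pandas as pd

df = pd.DataFrame({'Customer': [1, 2],
                   'Transactions': [('a', 'b', 'c'), ('d', 'e')]})

# Approach from the answer: spread the tuples into columns, then stack them.
wide = pd.DataFrame(df['Transactions'].tolist(), index=df['Customer'])
stacked = (wide.stack()
               .dropna()                          # remove padding from shorter tuples
               .reset_index(level=1, drop=True)   # drop the 0/1/2 position level
               .reset_index(name='Transactions'))

# Alternative (assumes pandas >= 0.25): one row per tuple element.
exploded = df.explode('Transactions').reset_index(drop=True)

print(stacked)
print(exploded)
```

Both results should contain one row per customer and transaction ID, with the columns Customer and Transactions. Where it is available, `explode` is probably the closer match to the "one or two lines" asked for in the question.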