
How do I convert rows in a dataframe into multiple lists in Python?


I have a large dataframe of the following form:

Part    Text
P1  a
P1  b
P2  c
P2  d

I want to transform this data into two formats, grouped by Part:

Expected output 1:

Part    Text
P1  ['a','b']
P2  ['c','d']

Expected output 2:

Part    Text
P1  ['ab']
P2  ['cd']

Can anyone help me with this?


Solution

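  • Use groupby.agg: group by Part, then aggregate Text with list for the first format, and with a joined string wrapped in a one-element list for the second: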

    df1 = df.groupby('Part')['Text'].agg(list).reset_index()
    df2 = df.groupby('Part')['Text'].agg(lambda x: [''.join(x.values)]).reset_index()
    
    print(df1)
      Part    Text
    0   P1  [a, b]
    1   P2  [c, d]
    
    print(df2)
      Part  Text
    0   P1  [ab]
    1   P2  [cd]
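
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the dataframe looks exactly like the four-row sample in the question (the construction of df is my reconstruction, not the asker's real data) and that Text holds strings, since ''.join would fail on non-string values.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: Text contains only strings)
df = pd.DataFrame({
    "Part": ["P1", "P1", "P2", "P2"],
    "Text": ["a", "b", "c", "d"],
})

# Group once and reuse the grouped Text column for both aggregations
grouped = df.groupby("Part")["Text"]

# Format 1: one list of the original values per Part
df1 = grouped.agg(list).reset_index()

# Format 2: one single-element list holding the concatenated string per Part
df2 = grouped.agg(lambda s: ["".join(s)]).reset_index()

print(df1)
print(df2)
```

If the column may contain numbers or missing values, converting it first (for example with df['Text'].astype(str)) is one way to keep the join from raising, though that also turns NaN into the string 'nan'.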