I have a large dataframe of the following form:
Part Text
P1 a
P1 b
P2 c
P2 d
I want to transform this data into two formats.
Expected output 1:
Part Text
P1 ['a','b']
P2 ['c','d']
Expected output 2:
Part Text
P1 ['ab']
P2 ['cd']
Can anyone help me with this?
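Use groupby.agg. Pass list to collect each group's values, or a lambda that joins them into a single string wrapped in a list: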
df1 = df.groupby('Part')['Text'].agg(list).reset_index()
df2 = df.groupby('Part')['Text'].agg(lambda x: [''.join(x.values)]).reset_index()
print(df1)
Part Text
0 P1 [a, b]
1 P2 [c, d]
print(df2)
Part Text
0 P1 [ab]
1 P2 [cd]
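If you want to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question. The third variant (df3, a plain string per group rather than a one-element list) was not asked for and is only an assumption about what might be more convenient downstream.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Part": ["P1", "P1", "P2", "P2"],
    "Text": ["a", "b", "c", "d"],
})

# Output 1: collect each group's Text values into a list.
df1 = df.groupby("Part")["Text"].agg(list).reset_index()

# Output 2: concatenate each group's strings and wrap the result in a list.
df2 = df.groupby("Part")["Text"].agg(lambda x: ["".join(x)]).reset_index()

# Optional variant (an assumption, not in the question): one plain string per group.
df3 = df.groupby("Part")["Text"].agg("".join).reset_index()

print(df1)
print(df2)
print(df3)
```

Note that groupby sorts the group keys by default, so the rows come back ordered by Part; pass sort=False to groupby if you need the order of first appearance instead.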