pandas: iterate over many DataFrames and create a list of occurrences per key


I have a few hundred DataFrames with the same structure. I want to aggregate them per key as follows: for each column, collect the values from every DataFrame into a list, so that a list column becomes a list of lists (one inner list per DataFrame). For example, with 2 DataFrames:

df1 =
Key  C1     C2  C3
 A   [1,2]  6   b
 B   [6,1]  9   c

df2 =
Key  C1     C2  C3
 B   [5,8]  2   t
 A   [7,2]  3   z

df_agg =  
   Key      C1          C2        C3 
    A  [[1,2],[7,2]]   [6,3]    [b,z]  
    B  [[6,1],[5,8]]   [9,2]    [c,t]

Please note that I have a few hundred of these DataFrames: df1, df2, ... dfn


Solution

  • EDIT: Solution for a list of DataFrames: use concat to stack them, then group by Key and aggregate every column with list:

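    import pandas as pd

    dfs = [df1, df2]  # put all of df1, df2, ... dfn in this list

    # Stack all frames, then collect each column's values into a list per Key
    df = pd.concat(dfs).groupby('Key', as_index=False).agg(list)
    print(df)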
      Key                C1      C2      C3
    0   A  [[1, 2], [7, 2]]  [6, 3]  [b, z]
    1   B  [[6, 1], [5, 8]]  [9, 2]  [c, t]
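
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds df1 and df2 from the question and wraps the concat/groupby step in a helper. The helper name aggregate_per_key and its key parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

# Sample frames mirroring df1 and df2 from the question.
df1 = pd.DataFrame({
    "Key": ["A", "B"],
    "C1": [[1, 2], [6, 1]],
    "C2": [6, 9],
    "C3": ["b", "c"],
})
df2 = pd.DataFrame({
    "Key": ["B", "A"],
    "C1": [[5, 8], [7, 2]],
    "C2": [2, 3],
    "C3": ["t", "z"],
})


def aggregate_per_key(frames, key="Key"):
    """Hypothetical helper: stack the frames, then collect each column's
    values into one list per key."""
    stacked = pd.concat(frames, ignore_index=True)
    return stacked.groupby(key, as_index=False).agg(list)


# With a few hundred frames, build this list in a loop or comprehension
# instead of naming each frame by hand.
dfs = [df1, df2]
df_agg = aggregate_per_key(dfs)
print(df_agg)
```

Within each key, the values appear in the order of the frames in dfs, because concat keeps row order and groupby preserves it inside each group. The groups themselves are sorted by key.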