pythonpandasrelational

Pandas select top 3 and append from another table


Hi I would like to take the top 3 numbers for each person and and append the reason_comments into them. If there is a tie, I would like to just take the first one.

May i know how can i do this in python please?

Table 1:
      id    VarA    VarB    VarC    VarD    VarE
        1   5        4       3       2       1
        2   4        6      21       5       5
        3   3        8      6        9       0
        4   7        8      23      44       0

Table 2: 
    reason_code reason_comment
    VarA        A is high
    VarB        B is high
    VarC        C is high
    VarD        D is high
    VarE        E is high


Results:
id  reason 1    reason 2    reason 3
1   A is high   B is high   C is high
2   C is high   B is high   D is high
3   D is high   B is high   C is high
4   D is high   C is high   B is high

Solution

  • There are possible ties, so is necessary remove them. So you can reshape DataFrame by DataFrame.melt, sorting by DataFrame.sort_values and remove duplicated by DataFrame.drop_duplicates.

    df1 = (df.melt('id')
            .sort_values(['id','value'], ascending=[True, False])
            .drop_duplicates(['id','value']))
    

    Then for filter top3 is used GroupBy.cumcount for possible reuse it for new column names in DataFrame.pivot:

    df1['g'] = df1.groupby('id').cumcount().add(1) 
        
    df1 = df1[df1['g'].le(3)]
    

    Also use Series.map for data by another DataFrame:

    s = df2.set_index('reason_code')['reason_comment']
    df1['variable'] = df1['variable'].map(s)
    
    df1 = df1.pivot('id','g','variable').add_prefix('reason')
    
    print (df)
    
    g     reason1    reason2    reason3
    id                                 
    1   A is high  B is high  C is high
    2   C is high  B is high  D is high
    3   D is high  B is high  C is high
    4   D is high  C is high  B is high
    

    For convert id to column and remove g use:

    df1 = df1.reset_index().rename_axis(None, axis=1)