python pandas dataframe

Show only selected results from DataFrame column divisions


I have a DataFrame in which I divide the month columns by the Total column. In the end, I want to show only the results greater than or equal to 30% (0.3).

import pandas as pd

data = {'Name' : ["John","Peter","Kate","Liz"],
'Jan-2016': [408,88,277,32],
'Feb-2016': [313,348,73,302],
'Mar-2016': [497,197,196,164],
'Apr-2016': [88,87,447,118],
'May-2016': [79,20,235,273],
'Total': [1385,740,1228,889]}

df = pd.DataFrame(data)

df_1 = df[["Jan-2016","Feb-2016","Mar-2016","Apr-2016","May-2016"]].div(df["Total"], axis=0)

So far I have only managed to work out the calculation.

How can I output only the results greater than or equal to 0.3? (Please note that one person may have anywhere from 0 to 3 matching columns.)


Solution

    import pandas as pd
    
    data = {'Name' : ["John","Peter","Kate","Liz"],
    'Jan-2016': [408,88,277,32],
    'Feb-2016': [313,348,73,302],
    'Mar-2016': [497,197,196,164],
    'Apr-2016': [88,87,447,118],
    'May-2016': [79,20,235,273],
    'Total': [1385,740,1228,889]}
    
    df = pd.DataFrame(data)
    
    cols = ["Jan-2016","Feb-2016","Mar-2016","Apr-2016","May-2016"]
    
    df[cols] = df[cols].div(df["Total"], axis=0)
    
    # reshape to use months as a single column
    reshaped = pd.melt(df, id_vars="Name", value_vars=cols)
    
    result = reshaped[reshaped['value'] >= 0.3]
    
    print(result)
    

    Output:

         Name  variable     value
    5   Peter  Feb-2016  0.470270
    7     Liz  Feb-2016  0.339708
    8    John  Mar-2016  0.358845
    14   Kate  Apr-2016  0.364007
    19    Liz  May-2016  0.307087
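
If a wide layout (one row per person) is preferred over the melted one, a possible alternative is to mask the ratios with `DataFrame.where`. This is only a sketch: the names `threshold`, `indexed`, `ratios`, `masked`, and `matches` are illustrative and not part of the original answer, and the 0.3 cutoff is taken from the question.

```python
import pandas as pd

data = {
    "Name": ["John", "Peter", "Kate", "Liz"],
    "Jan-2016": [408, 88, 277, 32],
    "Feb-2016": [313, 348, 73, 302],
    "Mar-2016": [497, 197, 196, 164],
    "Apr-2016": [88, 87, 447, 118],
    "May-2016": [79, 20, 235, 273],
    "Total": [1385, 740, 1228, 889],
}
df = pd.DataFrame(data)

cols = ["Jan-2016", "Feb-2016", "Mar-2016", "Apr-2016", "May-2016"]
threshold = 0.3  # assumed cutoff, taken from the question

# Index by name so the rows stay labelled after the division.
indexed = df.set_index("Name")
ratios = indexed[cols].div(indexed["Total"], axis=0)

# Keep values at or above the threshold; everything else becomes NaN.
masked = ratios.where(ratios >= threshold)

# Drop people who have no month at or above the threshold.
masked = masked.dropna(how="all")

# Number of matching months per person.
matches = masked.notna().sum(axis=1)

print(masked.round(3))
print(matches)
```

With the sample data, Liz has two matching months (Feb-2016 and May-2016) and the other three people have one each. The melted form in the answer above is more compact when most cells fall below the cutoff; the masked form keeps the original table shape.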