I have a data frame and want to divide the month columns by the Total column. At the end, I want to show only the results greater than or equal to 30% (0.3).
import pandas as pd
data = {'Name' : ["John","Peter","Kate","Liz"],
'Jan-2016': [408,88,277,32],
'Feb-2016': [313,348,73,302],
'Mar-2016': [497,197,196,164],
'Apr-2016': [88,87,447,118],
'May-2016': [79,20,235,273],
'Total': [1385,740,1228,889]}
df = pd.DataFrame(data)
df_1 = df[["Jan-2016","Feb-2016","Mar-2016","Apr-2016","May-2016"]].div(df["Total"], axis=0)
I can only work out the division itself.
How can I output only the results greater than or equal to 0.3? (Please note that one person may have anywhere from 0 to 3 matching columns.)
import pandas as pd
data = {'Name' : ["John","Peter","Kate","Liz"],
'Jan-2016': [408,88,277,32],
'Feb-2016': [313,348,73,302],
'Mar-2016': [497,197,196,164],
'Apr-2016': [88,87,447,118],
'May-2016': [79,20,235,273],
'Total': [1385,740,1228,889]}
df = pd.DataFrame(data)
cols = ["Jan-2016","Feb-2016","Mar-2016","Apr-2016","May-2016"]
df[cols] = df[cols].div(df["Total"], axis=0)
# reshape to long format: one row per (Name, month) pair
reshaped = pd.melt(df, id_vars="Name", value_vars=cols)
# keep only the rows whose ratio is at least 0.3
result = reshaped[reshaped['value'] >= 0.3]
print(result)
Output:
     Name  variable     value
5   Peter  Feb-2016  0.470270
7     Liz  Feb-2016  0.339708
8    John  Mar-2016  0.358845
14   Kate  Apr-2016  0.364007
19    Liz  May-2016  0.307087
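
If you would rather keep one row per person, a possible alternative is to mask the ratios instead of melting them. The following is only a sketch under the same sample data. The names `indexed`, `ratios`, `wide` and `matches` are illustrative and not part of the original answer.

```python
import pandas as pd

data = {'Name': ["John", "Peter", "Kate", "Liz"],
        'Jan-2016': [408, 88, 277, 32],
        'Feb-2016': [313, 348, 73, 302],
        'Mar-2016': [497, 197, 196, 164],
        'Apr-2016': [88, 87, 447, 118],
        'May-2016': [79, 20, 235, 273],
        'Total': [1385, 740, 1228, 889]}
df = pd.DataFrame(data)
cols = ["Jan-2016", "Feb-2016", "Mar-2016", "Apr-2016", "May-2016"]

# Use Name as the index so each row is labelled by person
indexed = df.set_index("Name")
ratios = indexed[cols].div(indexed["Total"], axis=0)

# Keep the wide layout; values below the threshold become NaN
wide = ratios.where(ratios >= 0.3)
print(wide)

# Per-person summary: only the months that passed the threshold
matches = {name: row.dropna().round(3).to_dict()
           for name, row in wide.iterrows()}
print(matches)
```

The `wide` frame keeps every person visible even when none of their months qualify, while the melted version in the answer above is easier to sort or aggregate further.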