I have the following code:
import pandas as pd

# Named df rather than input, which would shadow Python's built-in input()
df = pd.DataFrame({
    'Police District Name': ['WHEATON', 'SILVER SPRING', 'BETHESDA', 'GERMANTOWN', 'WHEATON', 'MONTGOMERY VILLAGE'],
    'cn1': ['Crime Against Person', 'Crime Against Person', 'Crime Against Person', 'other', 'other', 'other'],
    'cn2': ['Aggravated Assault', 'bla', 'bla', 'blaa', 'bla', 'one more bla'],
    'cn3': ['Aggravated Assault', 'bla', 'bla', 'blaa', 'bla', 'one more bla'],
})
df
Desired output:
output = pd.DataFrame({
    'Police District Name': ['WHEATON', 'SILVER SPRING', 'BETHESDA', 'GERMANTOWN', 'WHEATON', 'MONTGOMERY VILLAGE'],
    'total crime number': [6, 3, 3, 3, 6, 3],
})
output
How can I get this? Thank you!
If every value in cn1, cn2 and cn3 is filled with a crime, you can simply use the number of columns. The idea is to build a series of row counts per district via value_counts, multiply it by the number of cnX columns, and then map the result back onto your dataframe.
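# Rows per district, times the number of crime columns (every column except the district name)
counts = df['Police District Name'].value_counts() * (len(df.columns) - 1)
# Look up each row's district in the counts series
df['total crime number'] = df['Police District Name'].map(counts)
print(df[['Police District Name', 'total crime number']])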
   Police District Name  total crime number
0               WHEATON                   6
1         SILVER SPRING                   3
2              BETHESDA                   3
3            GERMANTOWN                   3
4               WHEATON                   6
5    MONTGOMERY VILLAGE                   3
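The approach above relies on every cnX cell being filled. If some of them might be empty, one possible variation is to count the non-null cells in each row and sum those counts per district. This is only a sketch: the helper name add_total_crimes, the 'cn' prefix test used to pick the crime columns, and the sample frame with one missing value are assumptions made for illustration, not part of the original question.

```python
import pandas as pd

def add_total_crimes(df, key='Police District Name', prefix='cn'):
    """Return a copy of df with a per-district total of non-null crime cells.

    Hypothetical helper: assumes the crime columns are exactly those whose
    names start with `prefix`.
    """
    crime_cols = [c for c in df.columns if c.startswith(prefix)]
    # Number of filled (non-null) crime cells in each row
    per_row = df[crime_cols].notna().sum(axis=1)
    out = df.copy()
    # Sum the per-row counts within each district and broadcast back to every row
    out['total crime number'] = per_row.groupby(df[key]).transform('sum')
    return out

# Illustrative data: same shape as the question, but SILVER SPRING has one missing cn2 value
sample = pd.DataFrame({
    'Police District Name': ['WHEATON', 'SILVER SPRING', 'BETHESDA', 'GERMANTOWN', 'WHEATON', 'MONTGOMERY VILLAGE'],
    'cn1': ['Crime Against Person', 'Crime Against Person', 'Crime Against Person', 'other', 'other', 'other'],
    'cn2': ['Aggravated Assault', None, 'bla', 'blaa', 'bla', 'one more bla'],
    'cn3': ['Aggravated Assault', 'bla', 'bla', 'blaa', 'bla', 'one more bla'],
})

result = add_total_crimes(sample)
print(result[['Police District Name', 'total crime number']])
```

With fully populated columns this should give the same totals as the value_counts version; the difference only shows up when cells are missing, as with the SILVER SPRING row here.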