I have a dataset given as such in Python:
#Load the required libraries
import pandas as pd
#Create dataset
data = {'ID': [1, 1, 1, 1, 1,1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3],
'Salary': [1, 2, 3, 4, 5,6,7,8,9,10, 1, 2, 3,4,5,6, 1, 2, 3, 4,5,6,7,8],
'Children': ['No', 'Yes', 'Yes', 'Yes', 'No','No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes', 'No'],
'Days': [123, 128, 66, 120, 141,123, 128, 66, 120, 141, 52,96, 120, 141, 52,96, 120, 141,123,15,85,36,58,89],
}
#Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)
The above dataframe looks as such :
Now, for every ID/group, I wish to set an upperbound for some value of 'Salary'.
For example,
For ID=1, the upperbound of 'Salary' should be set at 4
For ID=2, the upperbound of 'Salary' should be set at 3
For ID=3, the upperbound of 'Salary' should be set at 5
The net result needs to look as such:
Can somebody please let me know how to achieve this task in python?
Use custom function with mapping by helper dictionary in GroupBy.transform
:
d = {1:4, 2:3, 3:5}
def f(x):
x.iloc[:d[x.name]] = d[x.name]
return x
df['Salary'] = df.groupby('ID')['Salary'].transform(f)
print (df)
ID Salary Children Days
0 1 4 No 123
1 1 4 Yes 128
2 1 4 Yes 66
3 1 4 Yes 120
4 1 5 No 141
5 1 6 No 123
6 1 7 Yes 128
7 1 8 Yes 66
8 1 9 Yes 120
9 1 10 No 141
10 2 3 Yes 52
11 2 3 Yes 96
12 2 3 No 120
13 2 4 Yes 141
14 2 5 Yes 52
15 2 6 Yes 96
16 3 5 Yes 120
17 3 5 Yes 141
18 3 5 No 123
19 3 5 Yes 15
20 3 5 No 85
21 3 6 Yes 36
22 3 7 Yes 58
23 3 8 No 89
Another idea is use GroupBy.cumcount
for counter per ID
, compared by mapped ID
and if match set mapped Series
by Series.mask
:
d = {1:4, 2:3, 3:5}
s = df['ID'].map(d)
df['Salary'] = df['Salary'].mask(df.groupby('ID').cumcount().lt(s), s)
Or if counter column is in Salary
is possible use:
s = df['ID'].map(d)
df['Salary'] = df['Salary'].mask(df['Salary'].le(s), s)
print (df)
ID Salary Children Days
0 1 4 No 123
1 1 4 Yes 128
2 1 4 Yes 66
3 1 4 Yes 120
4 1 5 No 141
5 1 6 No 123
6 1 7 Yes 128
7 1 8 Yes 66
8 1 9 Yes 120
9 1 10 No 141
10 2 3 Yes 52
11 2 3 Yes 96
12 2 3 No 120
13 2 4 Yes 141
14 2 5 Yes 52
15 2 6 Yes 96
16 3 5 Yes 120
17 3 5 Yes 141
18 3 5 No 123
19 3 5 Yes 15
20 3 5 No 85
21 3 6 Yes 36
22 3 7 Yes 58
23 3 8 No 89