I made three conditional selections on my dataframe. So let's say:
final_df[(final_df['acceptance_advice'] == 'standard') & (final_df['acceptance'] == 'ok')]
final_df[(final_df['acceptance_advice'] == 'not accepted') & (final_df['acceptance'] == 'ok')]
final_df[(final_df['acceptance_advice'] == 'postponed') & (final_df['acceptance'] == 'declined')]
Now I want to add a categorical variable (the class I am going to use for prediction) based on these selections. So let's say: the first selection should be class 1, the second should be class 2, and the third should be class 3.
I have tried:
cat_1 = final_df[(final_df['acceptance_advice'] == 'standard') & (final_df['acceptance'] == 'ok')]
cat_2 = final_df[(final_df['acceptance_advice'] == 'not accepted') & (final_df['acceptance'] == 'ok')]
cat_3 = final_df[(final_df['acceptance_advice'] == 'postponed') & (final_df['acceptance'] == 'declined')]
final_df['class'] = (cat_1 | cat_2 | cat_3).astype(int)
But this only works for two categories (e.g. 0 and 1), not for three.
final_df looks something like this:
id | feature1 | feature2 | acceptance_advice | acceptance |
---|---|---|---|---|
some value | some value | some value | some value | some value |
some value | some value | some value | some value | some value |
some value | some value | some value | some value | some value |
some value | some value | some value | some value | some value |
I want it to look like this:
id | feature1 | feature2 | acceptance_advice | acceptance | class |
---|---|---|---|---|---|
some value | some value | some value | some value | some value | 1 |
some value | some value | some value | some value | some value | 2 |
some value | some value | some value | some value | some value | 1 |
some value | some value | some value | some value | some value | 3 |
I want to add a column class, which should be the class to be predicted.
You can try the following to add a class column:
def set_class(row):
    # With axis=1, apply passes one row at a time, so each comparison yields a plain bool
    if row['acceptance_advice'] == 'standard' and row['acceptance'] == 'ok':
        return "1"
    elif row['acceptance_advice'] == 'not accepted' and row['acceptance'] == 'ok':
        return "2"
    elif row['acceptance_advice'] == 'postponed' and row['acceptance'] == 'declined':
        return "3"
    # Rows matching none of the conditions fall through and get None

final_df['class'] = final_df.apply(set_class, axis=1)
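If the dataframe is large, a vectorized alternative to the row-wise `apply` above may be worth trying. The sketch below uses `numpy.select`, which picks a label for each row from the first condition that matches. The sample rows and the `default=0` placeholder for unmatched rows are assumptions for illustration; only the column names and the three conditions come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the two column names come from the question.
final_df = pd.DataFrame({
    "acceptance_advice": ["standard", "not accepted", "standard", "postponed", "postponed"],
    "acceptance": ["ok", "ok", "ok", "declined", "ok"],
})

# One boolean mask per class, in the same order as the labels below.
conditions = [
    (final_df["acceptance_advice"] == "standard") & (final_df["acceptance"] == "ok"),
    (final_df["acceptance_advice"] == "not accepted") & (final_df["acceptance"] == "ok"),
    (final_df["acceptance_advice"] == "postponed") & (final_df["acceptance"] == "declined"),
]
labels = [1, 2, 3]

# Rows matching none of the conditions get 0 (an assumed placeholder value).
final_df["class"] = np.select(conditions, labels, default=0)
```

This produces integer labels rather than the strings returned by `set_class`; change `labels` to `["1", "2", "3"]` (and `default` to a string) if you prefer string classes.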