Imagine I have a dataframe with these variables and values:
ID | Weight | LR Weight | UR Weight | Age | LS Age | US Age | Height | LS Height | US Height |
---|---|---|---|---|---|---|---|---|---|
1 | 63 | 50 | 80 | 20 | 18 | 21 | 165 | 160 | 175 |
2 | 75 | 50 | 80 | 22 | 18 | 21 | 172 | 160 | 170 |
3 | 49 | 45 | 80 | 17 | 18 | 21 | 180 | 160 | 180 |
I want to create the additional following variables:
ID | Flag_Weight | Flag_Age | Flag_Height |
---|---|---|---|
1 | 1 | 1 | 1 |
2 | 1 | 0 | 0 |
3 | 1 | 0 | 1 |
These flags simbolize that the main variable values (e.g.: Weight, Age and Height) are between the correspondent Lower or Upper limits, which may start with different 2 digits (in this dataframe I gave four examples: LR, UR, LS, US, but in my real dataframe I have more), and whose limit values sometimes differ from ID to ID.
Can you help me create these flags, please?
Thank you in advance.
You can use reshaping using a temporary MultiIndex
:
(df.set_index('ID')
.pipe(lambda d: d.set_axis(pd.MultiIndex.from_frame(
d.columns.str.extract('(^[LU]?).*?\s*(\S+)$')),
axis=1)
)
.stack()
.assign(flag=lambda d: d[''].between(d['L'], d['U']).astype(int))
['flag'].unstack().add_prefix('Flag_').reset_index()
)
Output:
ID Flag_Age Flag_Height Flag_Weight
0 1 1 1 1
1 2 0 0 1
2 3 0 1 1