
How to reference a column by combining a string and another variable in a pandas DataFrame?


Imagine I have a dataframe with these variables and values:

ID  Weight  LR Weight  UR Weight  Age  LS Age  US Age  Height  LS Height  US Height
1   63      50         80         20   18      21      165     160        175
2   75      50         80         22   18      21      172     160        170
3   49      45         80         17   18      21      180     160        180

I want to create the additional following variables:

ID  Flag_Weight  Flag_Age  Flag_Height
1   1            1         1
2   1            0         0
3   1            0         1

These flags indicate whether each main variable's value (e.g. Weight, Age and Height) lies between its corresponding lower and upper limits. The limit columns carry a two-letter prefix that varies (in this dataframe I gave four examples: LR, UR, LS, US, but in my real dataframe I have more), and the limit values sometimes differ from ID to ID.

Can you help me create these flags, please?

Thank you in advance.


Solution

  • You can reshape the data using a temporary MultiIndex on the columns:

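    import pandas as pd

    (df.set_index('ID')
       # split each column name into (limit type: '', 'L' or 'U'; variable name);
       # assumes no main variable name itself starts with 'L' or 'U'
       .pipe(lambda d: d.set_axis(pd.MultiIndex.from_frame(
                         d.columns.str.extract(r'(^[LU]?).*?\s*(\S+)$')),
                                  axis=1)
            )
       # move the variable name into the rows, leaving columns '', 'L' and 'U'
       .stack()
       # 1 if the value lies within [L, U] (bounds included), else 0
       .assign(flag=lambda d: d[''].between(d['L'], d['U']).astype(int))
       ['flag'].unstack().add_prefix('Flag_').reset_index()
    )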
    

    Output:

       ID  Flag_Age  Flag_Height  Flag_Weight
    0   1         1            1            1
    1   2         0            0            1
    2   3         0            1            1
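
If the reshaping chain is hard to follow, the same flags can probably be built with an explicit loop over the main variables. The following is only a sketch under a few assumptions that are not stated in the question: main variable names contain no space, every limit column is named "<prefix> <variable>", lower-limit prefixes start with L and upper-limit prefixes start with U. The helper name `add_flags` and its `id_col` parameter are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Weight": [63, 75, 49],
    "LR Weight": [50, 50, 45],
    "UR Weight": [80, 80, 80],
    "Age": [20, 22, 17],
    "LS Age": [18, 18, 18],
    "US Age": [21, 21, 21],
    "Height": [165, 172, 180],
    "LS Height": [160, 160, 160],
    "US Height": [175, 170, 180],
})


def add_flags(frame, id_col="ID"):
    """Hypothetical helper: return the ID plus one 0/1 flag per main variable."""
    out = frame[[id_col]].copy()
    # Assumption: a main variable is any non-ID column whose name has no space.
    main_vars = [c for c in frame.columns if c != id_col and " " not in c]
    for var in main_vars:
        suffix = " " + var
        # Assumption: limit columns are "<prefix> <var>", prefix starting with L or U.
        lower = [c for c in frame.columns if c.endswith(suffix) and c.startswith("L")]
        upper = [c for c in frame.columns if c.endswith(suffix) and c.startswith("U")]
        if len(lower) != 1 or len(upper) != 1:
            raise ValueError(f"expected one lower and one upper limit column for {var!r}")
        # Series.between includes both bounds by default and compares row by row.
        out["Flag_" + var] = (
            frame[var].between(frame[lower[0]], frame[upper[0]]).astype(int)
        )
    return out


flags = add_flags(df)
print(flags)
```

This keeps the flag columns in the same order as the main variables in the original frame (Weight, Age, Height), and it raises an error instead of guessing when a variable does not have exactly one lower and one upper limit column.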