[SOLVED] How to get the first column that satisfies condition

How to get the first column that satisfies condition

I have a dataset like the below which was got by traversing through a hierarchy:

id	name	rank	manager_id	manager_id_1	manager_id_2	...	manager_id_20

There is not a set number of levels, so the data might well go to manager_id_25 in future.

For every row, I want to find the first column from manager_id onwards which satisfies a condition. The condition is to check if a specific substring is present in the column.

For example, if the row data is:

id	name	rank	manager_id	manager_id_1	manager_id_2	manager_id_3	...	result
xx1	namexx	T	xx2-TR	xx3-FR	xx4-FR	xx22-ER	...	xx3-FR

and I want the first column that contains "FR" in the data, then "xx3-FR" has to be populated in the result column.

Solution

First, find all columns that are relevant, i.e. start manager_id:

manager_columns = [col for col in df.columns if col.startswith('manager_id')]

Now find your substring within your selection and return the first column value that satisfies the condition:

df['result'] = df[manager_columns].apply(lambda row: next((val for val in row if 'FR' in val), None), axis=1)