I have a dataframe that discretizes customers into different Q's, which looks like:
    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country
0            12346   1   0   0   0   0  United Kingdom
2            12347   0   0   0   0   1         Iceland
9            12348   0   1   0   0   0         Finland
13           12349   0   0   0   0   1           Italy
14           12350   0   1   0   0   0          Norway
What I want to do is add a new column, Q, to the dataframe that shows which sector each customer is in, so it looks like:
    CustomerID_num  Q1  Q2  Q3  Q4  Q5   Q         Country
0            12346   1   0   0   0   0   1  United Kingdom
2            12347   0   0   0   0   1   5         Iceland
9            12348   0   1   0   0   0   2         Finland
13           12349   0   0   0   0   1   5           Italy
14           12350   0   1   0   0   0   2          Norway
The only way I can think of is a for loop, but that gets messy. Is there another way to do this?
One option is to drop down into NumPy.
Filter for just the Q columns:
cols = df.filter(like='Q')
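Get the column position of the 1 in each row (this relies on every row having exactly one 1):
_, positions = cols.to_numpy().nonzero()
The positions are zero-based, so add 1 and assign the result as the new column:
df.assign(Q=positions + 1)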
    CustomerID_num  Q1  Q2  Q3  Q4  Q5         Country  Q
0            12346   1   0   0   0   0  United Kingdom  1
2            12347   0   0   0   0   1         Iceland  5
9            12348   0   1   0   0   0         Finland  2
13           12349   0   0   0   0   1           Italy  5
14           12350   0   1   0   0   0          Norway  2
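For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question. The stricter `regex` pattern, the sanity check, and the use of `DataFrame.insert` to place Q before Country (as in the desired output) are my own additions rather than part of the steps above. It also assumes the Q columns appear in order Q1..Q5.

```python
import pandas as pd

# Sample data rebuilt from the question, including its non-contiguous index.
df = pd.DataFrame(
    {
        "CustomerID_num": [12346, 12347, 12348, 12349, 12350],
        "Q1": [1, 0, 0, 0, 0],
        "Q2": [0, 0, 1, 0, 1],
        "Q3": [0, 0, 0, 0, 0],
        "Q4": [0, 0, 0, 0, 0],
        "Q5": [0, 1, 0, 1, 0],
        "Country": ["United Kingdom", "Iceland", "Finland", "Italy", "Norway"],
    },
    index=[0, 2, 9, 13, 14],
)

# Select only Q1, Q2, ... (stricter than like='Q', which matches any
# column name containing "Q").
cols = df.filter(regex=r"^Q\d+$")

# The nonzero trick only lines up with the rows if each row has exactly one 1.
assert (cols.sum(axis=1) == 1).all()

# nonzero() returns (row_indices, column_indices) in row-major order,
# so the column indices come back in the same order as the rows.
_, positions = cols.to_numpy().nonzero()

# Work on a copy and put Q right before Country; positions are zero-based.
result = df.copy()
result.insert(result.columns.get_loc("Country"), "Q", positions + 1)

print(result)
```

If some rows could be all zeros or contain several 1s, the assertion fails instead of silently misaligning the values, and that case would need to be handled separately.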