Explode Dataframe and add new columns with specific values based on a condition

I have a dataframe with 6 columns: 'Name', 'A', 'B', 'C', 'Val', 'Category'

It looks like this:

Name   A     B     C   Val   Category
 x    1.1   0     0.2  NA    NA
 y    0     0.1   0    NA    NA
 z    0.5   0.1   0.3  NA    NA

I want to expand the dataframe such that for each value that is not 0 in columns 'A', 'B', 'C' you get an extra row. The column 'Val' is assigned the non-zero value that led to the expansion and the 'Category' is arbitrarily based on where the value came from.

The result should look like this:

Name   A    B     C    Val   Category
 x    1.1   0     0.2  1.1   first
 x    1.1   0     0.2  0.2   third
 y    0     0.1   0    0.1   second
 z    0.5   0.1   0.3  0.5   fisrt
 z    0.5   0.1   0.3  0.1   second
 z    0.5   0.1   0.3  0.3   third

This is probably the wrong approach, but I thought since I only have three columns I should be repeating all the rows 3 times by using the repeat function on the index and then looping through the rows based on a for loop with a skip to apply 3 functions to assign the target and AICN all rows and then dropping rows where the target is 0.

def targeta(row):
    target = row
    val = 'first'
    return target, val

def targetb(row):
    target = row
    val = 'second'
    return target, val

def targetc(row):
    target = row
    val = 'third'
    return target, val

df_repeat = df.loc[df.index.repeat(3)]

for i in range(1,len(df_repeat)-3,3):
    df_repeat.iloc[i][['Target','Category']]=targeta(df_repeat.iloc[i]['A'])
    df_repeat.iloc[i+1][['Target','Category']]=targetb(df_repeat.iloc[i+1]['B'])
    df_repeat.iloc[i+2][['Target','Category']]=targetc(df_repeat.iloc[i+2]['C'])

I only got to this point and realized I am getting an empty dataframe. Any suggestions on what to do?

Solution

You could replace the 0s with NaNs, rename the columns to your categories, reshape to long with stack, and join back to the original to duplicate the rows:

out = (df
       .drop(columns=['Val', 'Category'])
       .join(df[['A', 'B', 'C']]
             .set_axis(['first', 'second', 'third'], axis=1)
             .rename_axis(columns='Category')
             .replace(0, pd.NA)
             .stack()
             .rename('Val')
             .reset_index(-1)
            )
       )

Output:

  Name    A    B    C Category  Val
0    x  1.1  0.0  0.2    first  1.1
0    x  1.1  0.0  0.2    third  0.2
1    y  0.0  0.1  0.0   second  0.1
2    z  0.5  0.1  0.3    first  0.5
2    z  0.5  0.1  0.3   second  0.1
2    z  0.5  0.1  0.3    third  0.3