pythonpandasdataframepivot-tableunpivot

Pandas stack() if columns have a specific value


I am trying to stack this table based on ID column but only considering columns [A-D] where the value is 1 and not 0.

Current df:

ID A B C D
1 1 0 0 1
3 0 1 0 1
7 1 0 1 1
8 1 0 0 0

What I want:

ID LETTER
1 A
1 D
3 B
3 D
7 A
7 C
7 D
8 A

The following code works but I need a more efficient solution as I have a df with 93434 rows x 12377 columns.

stacked_df = df.set_index('ID').stack().reset_index(name='has_letter').rename(columns={'level_1':'LETTER'})
stacked_df = stacked_df[stacked_df['has_letter']==1].reset_index(drop=True)
stacked_df.drop(['has_letter'], axis=1, inplace=True)

Solution

  • Try:

    print(
        df.set_index("ID")
        .apply(lambda x: x.index[x == 1], axis=1)
        .reset_index()
        .explode(0)
        .rename(columns={0: "LETTERS"})
    )
    

    Prints:

       ID LETTERS
    0   1       A
    0   1       D
    1   3       B
    1   3       D
    2   7       A
    2   7       C
    2   7       D
    3   8       A
    

    Or:

    x = df.set_index("ID").stack()
    print(
        x[x == 1]
        .reset_index()
        .drop(columns=0)
        .rename(columns={"level_1": "LETTER"})
    )
    

    Prints:

       ID LETTER
    0   1      A
    1   1      D
    2   3      B
    3   3      D
    4   7      A
    5   7      C
    6   7      D
    7   8      A