python pandas recursion

Extract lists from cells and add new rows to a df


Please find below an example of a small and simple df I've been working on. I've been struggling to get rid of the lists in its cells and replace them with their individual elements.

col1        col2        col3
[a,b]       not a list  [1,2,3]
not a list  [@, $]      not a list
not a list  not a list  not a list

Lists can appear anywhere in my df. In a given row, they may be in every column or in none of them. I need to extract every list from its cell and duplicate the relevant rows so that they cover every combination of the list elements. This is the final result I'd like to get:

col1        col2        col3
a           not a list  1
a           not a list  2
a           not a list  3
b           not a list  1
b           not a list  2
b           not a list  3
not a list  @           not a list
not a list  $           not a list
not a list  not a list  not a list

I figured the best way to make such a modification is some kind of recursion. Unfortunately, I've been struggling with the implementation for quite some time. I'd be very thankful for help and inspiration.


Solution

  • Assuming the cells hold actual Python lists, like this:

    import pandas as pd

    df = pd.DataFrame({'col1': [['a', 'b'], 'not a list', 'not a list'],
                       'col2': ['not a list', ['@', '$'], 'not a list'],
                       'col3': [[1,2,3], 'not a list', 'not a list']})
    

    You could explode the columns one after another; within each row this generates the cartesian product of the list elements:

    out = df.explode('col1').explode('col2').explode('col3')
    

    Or using functools.reduce to handle all columns programmatically:

    from functools import reduce
    
    out = reduce(lambda x, c: x.explode(c), list(df), df)
    

    Output:

             col1        col2        col3
    0           a  not a list           1
    0           a  not a list           2
    0           a  not a list           3
    0           b  not a list           1
    0           b  not a list           2
    0           b  not a list           3
    1  not a list           @  not a list
    1  not a list           $  not a list
    2  not a list  not a list  not a list
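
For reuse, the reduce approach above could be wrapped in a small helper. The following is only a minimal sketch: the function name `explode_all` is an illustrative choice and not a pandas API, and the final `reset_index` call is an optional extra step, added on the assumption that the repeated index labels (0, 0, 0, ...) left by `explode` are not wanted.

```python
from functools import reduce

import pandas as pd


def explode_all(frame):
    """Explode every column in turn (hypothetical helper, not part of pandas)."""
    # Each explode call expands the list-like cells of one column into rows;
    # scalar cells such as 'not a list' are left unchanged.
    exploded = reduce(lambda acc, col: acc.explode(col), list(frame), frame)
    # explode repeats the original index labels, so renumber the rows.
    return exploded.reset_index(drop=True)


df = pd.DataFrame({'col1': [['a', 'b'], 'not a list', 'not a list'],
                   'col2': ['not a list', ['@', '$'], 'not a list'],
                   'col3': [[1, 2, 3], 'not a list', 'not a list']})

out = explode_all(df)
print(out)
```

This should give the same nine rows as the output above, numbered 0 to 8. One thing to watch for: `explode` turns an empty list into NaN, so rows with empty lists are kept rather than dropped.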