Please find below an example of a small, simple df I've been working on. I've been struggling to get rid of the lists in its cells and replace them accordingly.
col1 | col2 | col3 |
---|---|---|
[a,b] | not a list | [1,2,3] |
not a list | [@, $] | not a list |
not a list | not a list | not a list |
Lists can appear anywhere in my df. In a given row, they may be in every column or in none of them. I need to extract every list from its cell and duplicate the relevant rows so that they cover every combination of the list elements. This is what I'd like to get as a final result:
col1 | col2 | col3 |
---|---|---|
a | not a list | 1 |
a | not a list | 2 |
a | not a list | 3 |
b | not a list | 1 |
b | not a list | 2 |
b | not a list | 3 |
not a list | @ | not a list |
not a list | $ | not a list |
not a list | not a list | not a list |
I figured the best way to make such a modification is some kind of recursion. Unfortunately, I've been struggling with the implementation for quite some time. I'd be very thankful for any help and inspiration.
Assuming actual lists like this:
import pandas as pd

df = pd.DataFrame({'col1': [['a', 'b'], 'not a list', 'not a list'],
                   'col2': ['not a list', ['@', '$'], 'not a list'],
                   'col3': [[1, 2, 3], 'not a list', 'not a list']})
You could explode all columns successively to generate a Cartesian product:
out = df.explode('col1').explode('col2').explode('col3')
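To see why chaining gives the product, here is a minimal sketch on a reduced two-column frame (the names `step1` and `step2` are only for illustration). Each `explode` copies the other columns' values, lists included, onto every new row, and leaves scalar cells unchanged, so the next `explode` multiplies the rows again:

```python
import pandas as pd

df = pd.DataFrame({'col1': [['a', 'b'], 'not a list'],
                   'col3': [[1, 2, 3], 'not a list']})

# Row 0 becomes two rows; the col3 list is copied onto both of them.
step1 = df.explode('col1')

# Each of those two rows now expands into three: 2 * 3 + 1 untouched row = 7.
step2 = step1.explode('col3')

print(len(step1), len(step2))
print(step2)
```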
Or use functools.reduce to handle all columns programmatically:
from functools import reduce
out = reduce(lambda x, c: x.explode(c), list(df), df)
Output:
col1 col2 col3
0 a not a list 1
0 a not a list 2
0 a not a list 3
0 b not a list 1
0 b not a list 2
0 b not a list 3
1 not a list @ not a list
1 not a list $ not a list
2 not a list not a list not a list
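Note that the original index labels are repeated in the output. If you want a fresh index, or want to skip columns that hold no lists at all, something like the following sketch may help. The helper name `explode_all` is hypothetical, and it assumes the cells hold real Python lists rather than strings that look like lists:

```python
from functools import reduce

import pandas as pd


def explode_all(df):
    """Explode every column holding at least one list, then renumber the rows."""
    # Only columns that actually contain a list need to be exploded.
    list_cols = [c for c in df.columns
                 if df[c].map(lambda v: isinstance(v, list)).any()]
    out = reduce(lambda acc, c: acc.explode(c), list_cols, df)
    # Replace the repeated index labels with 0..n-1.
    return out.reset_index(drop=True)


df = pd.DataFrame({'col1': [['a', 'b'], 'not a list', 'not a list'],
                   'col2': ['not a list', ['@', '$'], 'not a list'],
                   'col3': [[1, 2, 3], 'not a list', 'not a list']})

out = explode_all(df)
print(out)
```

One caveat: `explode` turns an empty list into a single NaN row, so you may want to handle empty lists separately if they can occur in your data.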