[SOLVED] How to combine pandas df so that rows with permuted col1 and col2 values can be merged containing only one combination & summing a count column

How to combine pandas df so that rows with permuted col1 and col2 values can be merged containing only one combination & summing a count column

I want to combine the permuted col1 and col2 values into one row. containing only the first combination & summing the count column of both. Is there an easy way to do this in pandas?

The example data frame and output: For example, in the below data frame I want to combine rows with values A, B, and B, A and sum their count column. Same for rows with values C, D, and D, C and sum their count values as well. I want to keep the rest of the rows in the data frame as is.

INPUT:

col1	col2	count
A	B	3
C	D	2
B	A	5
E	F	2
G	H	8
D	C	5
I	J	4

OUTPUT:

col1	col2	count
A	B	8
C	D	7
E	F	2
G	H	8
I	J	4

Solution

You can .groupby according sorted col1/col2:

x = (
    df.groupby(df[["col1", "col2"]].apply(lambda x: tuple(sorted(x)), 1))
    .agg({"col1": "first", "col2": "first", "count": "sum"})
    .reset_index(drop=True)
)
print(x)

Prints:

  col1 col2  count
0    A    B      8
1    C    D      7
2    E    F      2
3    G    H      8
4    I    J      4