python pandas dataframe

How to select / save rows that contain the same value multiple times in Pandas?


I have financial data where I need to find and save the rows in which the same value occurs two or more times, ignoring any value that is 0 or less than 1.

Say I have this:

A               B  C  D        E         F        G         H  I
5/7/2025 21:00  0  0  0        0         0        0         0  0
5/7/2025 21:15  0  0  19598.8  0         19598.8  0         0  0
5/7/2025 21:30  0  0  0        0         0        0         0  0
5/7/2025 21:45  0  0  0        19823.35  0        0         0  0
5/7/2025 22:00  0  0  0        0         0        0         0  0
5/7/2025 22:15  0  0  0        0         0        0         0  0
5/7/2025 22:30  0  0  0        19975.95  0        19975.95  0  19975.95
5/7/2025 23:45  0  0  0        0         0        0         0  0
5/8/2025 1:00   0  0  19830.2  0         0        0         0  0
5/8/2025 1:15   0  0  0        0         0        0         0  0
5/8/2025 1:30   0  0  0        0         0        0         0  0
5/8/2025 1:45   0  0  0        0         0        0         0  0

I want this, along with the other data in those rows:

A               B  C  D        E         F        G         H  I
5/7/2025 21:15  0  0  19598.8  0         19598.8  0         0  0
5/7/2025 22:30  0  0  0        19975.95  0        19975.95  0  19975.95

Solution

  • A simple approach could be to select the columns of interest, then identify if any value is duplicated within a row. Then select the matching rows with boolean indexing:

    mask = df.loc[:, 'B':].T
    
    out = df[mask.apply(lambda x: x.duplicated(keep=False)).where(mask >= 1).any()]
    

    A potentially more efficient approach could be to use numpy. Select the values, mask those below 1, sort each row, and check whether any two adjacent values in a row are identical with diff + isclose:

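    import numpy as np
    # keep values >= 1 (others become NaN), then sort each row; NaNs sort last
    mask = np.sort(df.loc[:, 'B':].where(lambda x: x >= 1).to_numpy(), axis=1)
    # in a sorted row, equal neighbours have a difference of ~0
    out = df[np.isclose(np.diff(mask), 0).any(axis=1)]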
    

    Output:

                    A  B  C        D         E        F         G  H         I
    1  5/7/2025 21:15  0  0  19598.8      0.00  19598.8      0.00  0      0.00
    6  5/7/2025 22:30  0  0      0.0  19975.95      0.0  19975.95  0  19975.95
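
For anyone who wants to try both ideas end to end, here is a minimal, self-contained sketch. It assumes the timestamps live in column A as plain strings and rebuilds only the first seven rows of the sample. The names `rows`, `vals`, `dup`, `arr`, `out_pandas` and `out_numpy` are made up for illustration, and the pandas variant combines the two conditions with `&` rather than `where`, which should behave the same on this data:

```python
import numpy as np
import pandas as pd

# First seven rows of the sample data; column A is assumed to hold the timestamp.
rows = [
    ("5/7/2025 21:00", 0, 0, 0, 0, 0, 0, 0, 0),
    ("5/7/2025 21:15", 0, 0, 19598.8, 0, 19598.8, 0, 0, 0),
    ("5/7/2025 21:30", 0, 0, 0, 0, 0, 0, 0, 0),
    ("5/7/2025 21:45", 0, 0, 0, 19823.35, 0, 0, 0, 0),
    ("5/7/2025 22:00", 0, 0, 0, 0, 0, 0, 0, 0),
    ("5/7/2025 22:15", 0, 0, 0, 0, 0, 0, 0, 0),
    ("5/7/2025 22:30", 0, 0, 0, 19975.95, 0, 19975.95, 0, 19975.95),
]
df = pd.DataFrame(rows, columns=list("ABCDEFGHI"))

vals = df.loc[:, "B":]  # numeric columns only

# pandas variant: flag values repeated within a row, ignoring anything below 1
dup = vals.T.apply(lambda col: col.duplicated(keep=False)).T
out_pandas = df[(dup & (vals >= 1)).any(axis=1)]

# numpy variant: NaN out values below 1, sort each row, compare neighbours
arr = np.sort(vals.where(vals >= 1).to_numpy(dtype=float), axis=1)
out_numpy = df[np.isclose(np.diff(arr, axis=1), 0).any(axis=1)]

print(out_pandas)
```

On this sample both variants keep the rows at index 1 and 6. `np.sort` is used instead of an in-place `.sort()` because the array you get back from a DataFrame can be read-only when Copy-on-Write is enabled.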