python pandas dataframe

How to select rows that display some type of pattern in Python


I am looking to extract rows from my dataset based on a pattern-like condition. Specifically, I want to find periods in a battery's charging history where it discharged from 100% to 0% without charging in between.

For example, in the dataset below I would like a function that returns only timestamps 7 to 12, since that range contains a full discharge of the battery.

timestamp Charge level (%)
1 50
2 55
3 40
4 60
5 80
6 100
7 100
8 85
9 60
10 55
11 40
12 0
13 20

The approach I have tried is to use the loc indexer in pandas to find rows with a charge level of 0% and then backtrack until I reach a row with a charge level of 100%, but I am struggling with the backtracking part.


Solution

  • The idea is to keep only the rows where the level is 0 or 100. Among those, the rows of interest are each 100 that is immediately followed by a 0, together with that 0.

    I also extended your example to include some harder cases, such as when the battery starts discharging and then starts charging again before it is fully discharged.

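    import pandas as pd

    # Extended example data, matching the table displayed below.
    df = pd.DataFrame({"time": range(1, 25),
                       "level": [20, 0, 40, 60, 80, 100, 100, 85, 60, 55, 40, 0,
                                 100, 50, 20, 55, 100, 100, 0, 100, 60, 30, 0, 20]})

    # display() is available in Jupyter/IPython sessions.
    with pd.option_context('display.max_columns', None):
        display(df.T)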
    
            0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23
    time    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24
    level   20  0   40  60  80  100 100 85  60  55  40  0   100 50  20  55  100 100 0   100 60  30  0   20
    
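    # Keep only the rows where the battery is either empty or full.
    mask = df["level"].isin([0, 100])
    res = df[mask].copy()
    
    # A candidate segment starts at a 100 whose next kept row is a 0 ...
    mask1 = (res["level"] == 100) & (res["level"].shift(-1) == 0)
    # ... and ends at a 0 whose previous kept row is a 100.
    mask2 = (res["level"] == 0) & (res["level"].shift(1) == 100)
    res = res.loc[mask1 | mask2, ["time", "level"]]
    display(res)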
    
        time  level
    6      7    100
    11    12      0
    17    18    100
    18    19      0
    19    20    100
    22    23      0
    

    Each consecutive pair of rows in this result gives the start (level 100) and end (level 0) of one segment of interest: times 7 to 12, 18 to 19, and 20 to 23.
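
Note that the filtering above only looks at the 0 and 100 rows, so a partial recharge that never reaches 100 (for example 100, 50, 70, 0) would still be reported as a full discharge. The following is a minimal sketch of one way to pair the start and end rows and additionally reject segments where the level rises in between. The helper name full_discharges is hypothetical (it is not part of pandas), and the sketch assumes the column names time and level from the example, a sorted unique index, and that flat stretches (no change in level) count as "not charging".

```python
import pandas as pd

# Extended example from the answer (assumed column names: "time", "level").
df = pd.DataFrame({
    "time": range(1, 25),
    "level": [20, 0, 40, 60, 80, 100, 100, 85, 60, 55, 40, 0,
              100, 50, 20, 55, 100, 100, 0, 100, 60, 30, 0, 20],
})


def full_discharges(df, col="level"):
    """Return (start_label, end_label) index pairs of uninterrupted 100 -> 0 runs.

    Hypothetical helper for illustration; assumes a sorted, unique index.
    """
    # Same idea as the answer: keep only the 0 and 100 rows.
    ends = df[df[col].isin([0, 100])]
    is_start = (ends[col] == 100) & (ends[col].shift(-1) == 0)
    is_stop = (ends[col] == 0) & (ends[col].shift(1) == 100)
    starts = ends.loc[is_start].index
    stops = ends.loc[is_stop].index

    segments = []
    # Every start row is directly followed by its stop row in `ends`,
    # so the two index lists line up pairwise.
    for start, stop in zip(starts, stops):
        levels = df.loc[start:stop, col]  # label slice, both ends included
        # Reject the candidate if the level rises anywhere inside it.
        if (levels.diff().dropna() <= 0).all():
            segments.append((start, stop))
    return segments


segments = full_discharges(df)
times = [(int(df.at[s, "time"]), int(df.at[e, "time"])) for s, e in segments]
print(times)  # [(7, 12), (18, 19), (20, 23)]
```

Each pair can then be turned back into rows with df.loc[start:stop]. On the example data the extra check does not change the result, because every candidate segment is already non-increasing; it only matters for data with partial recharges.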