python pandas dataframe

Pandas Extract Sequence where prev value > current value


I need to extract sequences of consecutive negative values in which each value is smaller than the previous one.

import pandas as pd

# Create the DataFrame with the given values
data = {
    'Value': [0.3, 0.2, 0.1, -0.1, -0.2, -0.3, -0.4, -0.35, -0.25, 0.1, -0.15, -0.25, -0.13, -0.1, 1]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

My Code:

# Initialize a list to hold the sequences
sequences = []
current_sequence = []

# Iterate through the DataFrame to apply the condition
for i in range(1, len(df) - 1):
    prev_value = df.loc[i - 1, 'Value']
    curr_value = df.loc[i, 'Value']
    next_value = df.loc[i + 1, 'Value']
    
    # Check the condition
    if curr_value < prev_value and curr_value < next_value:
        current_sequence.append(curr_value)
    else:
        # If the current sequence is not empty and it's a valid sequence, add it to sequences list and reset
        if current_sequence:
            sequences.append(current_sequence)
            current_sequence = []

# Add the last sequence if it's not empty
if current_sequence:
    sequences.append(current_sequence)

# Print the result
print("Extracted Sequences:")
for seq in sequences:
    print(seq)

My Output:

Extracted Sequences:
[-0.4]
[-0.25]

Expected Output:

[-0.1,-0.2,-0.3,-0.4]
[-0.15,-0.25]

Solution

  • You can build boolean masks that flag the negative values and the values that decrease from the previous row, then use groupby to split the matching rows into consecutive runs:

    # is the value negative?
    m1 = df['Value'].lt(0)
    
    # is the value no larger than the previous one? (use .lt(0) for strictly decreasing)
    m2 = df['Value'].diff().le(0)
    
    m = m1&m2
    
    # aggregate
    out = df[m].groupby((~m).cumsum())['Value'].agg(list).tolist()
    

    Output:

    [[-0.1, -0.2, -0.3, -0.4], [-0.15, -0.25]]
    

    If you just want to filter:

    out = df[m]
    

    Output:

        Value
    3   -0.10
    4   -0.20
    5   -0.30
    6   -0.40
    10  -0.15
    11  -0.25
    

    Intermediates:

        Value     m1  df['Value'].diff()     m2  m1&m2
    0    0.30  False                 NaN  False  False
    1    0.20  False               -0.10   True  False
    2    0.10  False               -0.10   True  False
    3   -0.10   True               -0.20   True   True
    4   -0.20   True               -0.10   True   True
    5   -0.30   True               -0.10   True   True
    6   -0.40   True               -0.10   True   True
    7   -0.35   True                0.05  False  False
    8   -0.25   True                0.10  False  False
    9    0.10  False                0.35  False  False
    10  -0.15   True               -0.25   True   True
    11  -0.25   True               -0.10   True   True
    12  -0.13   True                0.12  False  False
    13  -0.10   True                0.03  False  False
    14   1.00  False                1.10  False  False
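
To see how the `groupby((~m).cumsum())` step splits the rows into runs, it can help to print the grouping key next to the mask. This is a minimal sketch that reuses the data and masks from the answer; the variable name `key` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Value': [0.3, 0.2, 0.1, -0.1, -0.2, -0.3, -0.4, -0.35,
              -0.25, 0.1, -0.15, -0.25, -0.13, -0.1, 1]
})

# same combined mask as in the answer: negative AND not larger than the previous row
m = df['Value'].lt(0) & df['Value'].diff().le(0)

# `key` (name chosen for this sketch) increases by one on every row where
# the mask is False, so each unbroken run of True rows shares a single key
key = (~m).cumsum()

# show the mask and the grouping key side by side
print(df.assign(m=m, key=key))

# keep only the True rows, then collect each run into a list
out = df[m].groupby(key)['Value'].agg(list).tolist()
print(out)
```

Rows 3 to 6 share key 3 and rows 10 and 11 share key 6, which is why they end up in two separate lists. Every row where the mask is False gets a key of its own, but those rows are dropped by `df[m]` before grouping.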