pythonpandascountcumsumthreshold

Pandas Vectorized Operation - Making counting function that resets when threshold reaches 5


I am quite new to the programming and Im struggling to this matter. Any help is appreciated!

I have a dataframe of stocks including the prices and the signal if it will be up (1) or down (-1). I want to count the sequence of repetition into another column 'count'. So, when there is a sequence of 1,1,1; then the count will be 1,2,3. If its -1,-1,-1; then the count will be 1,2,3 too.

Additionally, when a threshold value reaches 5, the counting resets. Doesn't matter if it's 1 or -1.

So, what I have is:

     price  sign
0    13     1
1    12     1
2    11     -1
3    12     -1
4    13     1
5    14     1
6    14     1
7    14     1
8    14     1
9    14     1
10   14     1
.
.
.

And what I want is:

     price  sign  count
0    13     1       1
1    12     1       2
2    11     -1      1
3    12     -1      2
4    13     1       1
5    14     1       2
6    14     1       3
7    14     1       4
8    14     1       5
9    14     1       1
10   14     1       2
.
.
.

I already have this code in normal python code. But I cannot do this in Pandas Vectorized Operation! Help me, please!


Solution

  • Use GroupBy.cumcount by consecutive values of sign with modulo 5:

    df['count'] = df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount() % 5 + 1
    
    print (df)
        price  sign  count
    0      13     1      1
    1      12     1      2
    2      11    -1      1
    3      12    -1      2
    4      13     1      1
    5      14     1      2
    6      14     1      3
    7      14     1      4
    8      14     1      5
    9      14     1      1
    10     14     1      2
    

    Detail:

    print (df.assign(consecutive=df['sign'].ne(df['sign'].shift()).cumsum(),
                     counter=df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount(),
                     count = df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount() % 5 + 1))
        price  sign  consecutive  counter  count
    0      13     1            1        0      1
    1      12     1            1        1      2
    2      11    -1            2        0      1
    3      12    -1            2        1      2
    4      13     1            3        0      1
    5      14     1            3        1      2
    6      14     1            3        2      3
    7      14     1            3        3      4
    8      14     1            3        4      5
    9      14     1            3        5      1
    10     14     1            3        6      2