I am quite new to the programming and Im struggling to this matter. Any help is appreciated!
I have a dataframe of stocks including the prices and the signal if it will be up (1) or down (-1). I want to count the sequence of repetition into another column 'count'. So, when there is a sequence of 1,1,1; then the count will be 1,2,3. If its -1,-1,-1; then the count will be 1,2,3 too.
Additionally, when a threshold value reaches 5, the counting resets. Doesn't matter if it's 1 or -1.
So, what I have is:
price sign
0 13 1
1 12 1
2 11 -1
3 12 -1
4 13 1
5 14 1
6 14 1
7 14 1
8 14 1
9 14 1
10 14 1
.
.
.
And what I want is:
price sign count
0 13 1 1
1 12 1 2
2 11 -1 1
3 12 -1 2
4 13 1 1
5 14 1 2
6 14 1 3
7 14 1 4
8 14 1 5
9 14 1 1
10 14 1 2
.
.
.
I already have this code in normal python code. But I cannot do this in Pandas Vectorized Operation! Help me, please!
Use GroupBy.cumcount
by consecutive values of sign
with modulo 5
:
df['count'] = df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount() % 5 + 1
print (df)
price sign count
0 13 1 1
1 12 1 2
2 11 -1 1
3 12 -1 2
4 13 1 1
5 14 1 2
6 14 1 3
7 14 1 4
8 14 1 5
9 14 1 1
10 14 1 2
Detail:
print (df.assign(consecutive=df['sign'].ne(df['sign'].shift()).cumsum(),
counter=df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount(),
count = df.groupby(df['sign'].ne(df['sign'].shift()).cumsum()).cumcount() % 5 + 1))
price sign consecutive counter count
0 13 1 1 0 1
1 12 1 1 1 2
2 11 -1 2 0 1
3 12 -1 2 1 2
4 13 1 3 0 1
5 14 1 3 1 2
6 14 1 3 2 3
7 14 1 3 3 4
8 14 1 3 4 5
9 14 1 3 5 1
10 14 1 3 6 2