I have a dataframe of stock OHLC data and would like to find how many times the price crosses option strikes (a single summary statistic).
dataframe
open high low close volume datetime datetime2 n_strike strk_diff pinned_min
datetime2
2021-08-20 09:30:00-04:00 147.4400 147.5619 147.1201 147.3725 1660122.0 1629466200000 2021-08-20 13:30:00+00:00 145 2.3725 1
2021-08-20 09:31:00-04:00 147.3800 147.6600 147.1200 147.1350 430097.0 1629466260000 2021-08-20 13:31:00+00:00 145 2.1350 1
2021-08-20 09:32:00-04:00 147.1297 147.4800 147.0400 147.0550 308090.0 1629466320000 2021-08-20 13:32:00+00:00 145 2.0550 1
2021-08-20 09:33:00-04:00 147.1000 147.3199 147.0200 147.2348 285100.0 1629466380000 2021-08-20 13:33:00+00:00 145 2.2348 1
2021-08-20 09:34:00-04:00 147.2367 147.2600 146.9600 147.1250 290185.0 1629466440000 2021-08-20 13:34:00+00:00 145 2.1250 1
... ... ... ... ... ... ... ... ... ... ...
2022-07-15 15:55:00-04:00 149.8900 149.9800 149.8400 149.9550 525630.0 1657914900000 2022-07-15 19:55:00+00:00 150 0.0450 0
2022-07-15 15:56:00-04:00 149.9600 150.0000 149.9100 149.9900 675573.0 1657914960000 2022-07-15 19:56:00+00:00 150 0.0100 0
2022-07-15 15:57:00-04:00 149.9900 150.0000 149.9400 149.9900 464692.0 1657915020000 2022-07-15 19:57:00+00:00 150 0.0100 0
2022-07-15 15:58:00-04:00 149.9900 150.0500 149.9200 150.0300 753358.0 1657915080000 2022-07-15 19:58:00+00:00 150 0.0300 0
2022-07-15 15:59:00-04:00 150.0300 150.2500 149.9700 150.1700 1978823.0 1657915140000 2022-07-15 19:59:00+00:00 150 0.1700
For each row in the dataframe, I want to know how many of the strike prices it crosses, and to store that count in a new column. My code is as follows:
# make a list of the strikes
strikes = [*range(0,(round(df_expfri['high'].max())+5), 5)]
for row in df_temp:
    H = df_temp['high']
    L = df_temp['low']
    count = 0
    for x in strikes:
        if x < L :
            continue
        elif x > H:
            continue
        elif x > L & x < H:
            count +=1
    print (count)
The error I'm getting is below. If I'm interpreting it correctly, my variables H and L are Series, and that is what is causing the problem, but I am unsure how to resolve it.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [133], in <cell line: 7>()
10 count = 0
11 for x in strikes:
---> 12 if x < L :
13 continue
14 elif x > H:
File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py:1535, in NDFrame.__nonzero__(self)
1533 @final
1534 def __nonzero__(self):
-> 1535 raise ValueError(
1536 f"The truth value of a {type(self).__name__} is ambiguous. "
1537 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
1538 )
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thank you in advance.
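The problem is that although the for loop declares a row variable, the loop body never uses it: H and L are read directly from the dataframe, so each one is an entire column (a Series), and comparing a Series with a scalar inside an if statement raises this error. Note also that iterating over a dataframe directly yields its column labels, not its rows.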
Replace the for line with:
for _, row in df_temp.iterrows():
And the H and L variables:
H = row['high']
L = row['low']
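Putting the pieces together, here is a minimal sketch of the corrected loop, followed by a vectorized alternative that may be faster on minute-level data. The sample values and the names n_crossed, n_crossed_vec and total_crossings are illustrative assumptions, not part of the original code. One more thing to watch in the last elif of the question: `&` binds more tightly than `<` and `>`, so `x > L & x < H` is not evaluated as `(x > L) and (x < H)`; for scalars, use `and` or a chained comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for df_temp; only 'high' and 'low' are used.
df_temp = pd.DataFrame(
    {
        "high": [147.5619, 150.25, 156.00],
        "low": [147.1201, 149.97, 144.50],
    }
)

# Strikes every 5 points up to just above the highest high, as in the question.
strikes = list(range(0, int(round(df_temp["high"].max())) + 5, 5))

# Row-by-row version: row["high"] and row["low"] are scalars,
# so each comparison yields a plain boolean.
counts = []
for _, row in df_temp.iterrows():
    H = row["high"]
    L = row["low"]
    # Chained comparison instead of `&`; a strike that is only touched
    # (equal to the high or the low) is not counted, as in the question.
    counts.append(sum(1 for x in strikes if L < x < H))
df_temp["n_crossed"] = counts

# Vectorized version: compare every row against every strike via broadcasting.
strike_arr = np.array(strikes)
low = df_temp["low"].to_numpy()[:, None]
high = df_temp["high"].to_numpy()[:, None]
df_temp["n_crossed_vec"] = ((strike_arr > low) & (strike_arr < high)).sum(axis=1)

# Single summary statistic: total number of strike crossings over all rows.
total_crossings = int(df_temp["n_crossed"].sum())
```

In the vectorized version `&` is the right operator, because it combines two boolean arrays element-wise and each comparison is wrapped in its own parentheses.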