pythonpandastechnical-indicator

How to calculate relative volume using pandas with faster way?


I am trying to implement the RVOL by the time of day technical indicator, which can be used as the indication of market strength.

The logic behind this is as follows:

If the current time is 2022/3/19 13:00, we look through the same moment (13:00) at the previous N days and average all the previous volumes at that moment to calculate Average_volume_previous. Then, RVOL(t) is volume(t)/Average_volume_previous(t).

It is hard to use methods like rolling and apply to deal with this complex logic in the code I wrote. However, the operation time of for loop is catastrophically long.

from datetime import datetime
import pandas as pd
import numpy as np

datetime_array = pd.date_range(datetime.strptime('2015-03-19 13:00:00', '%Y-%m-%d %H:%M:%S'), datetime.strptime("2022-03-19 13:00:00", '%Y-%m-%d %H:%M:%S'), freq='30min')
volume_array = pd.Series(np.random.uniform(1000, 10000, len(datetime_array)))
df = pd.DataFrame({'Date':datetime_array, 'Volume':volume_array})
df.set_index(['Date'], inplace=True)

output = []
for idx in range(len(df)):
    date = str(df.index[idx].hour)+':'+str(df.index[idx].minute)
    temp_date = df.iloc[:idx].between_time(date, date)
    output.append(temp_date.tail(day_len).mean().iloc[0])

output = np.array(output)

Practically, there might be missing data in the datetime array. So, it would be hard to use fixed length lookback period to solve this. Is there any way to make this code work faster?


Solution

  • I'm not sure I understand, however this is the solution as far as I understand. I didn't use date as index df.set_index(['Date'], inplace=True)

    # Filter data to find instant 
    rolling_day = 10
    hour = df['Date'].dt.hour == 13
    minute = df['Date'].dt.minute == 0
    df_moment = df[ore&minuti].copy()
    

    Calculation of moving averages

    df_moment['rolling'] = df_moment.rolling(rolling_day).mean()
    

    Calculation of Average_volume_previous(t)/volume(t)

    for idx_s, idx_e in zip(df_moment['Volume'][::rolling_day], df_moment['rolling'][rolling_day::rolling_day]):
         print(f'{idx_s/idx_e}')
    

    Output:

    0.566379345408499
    0.7229214799940626
    0.6753586759429548
    2.0588617812341354
    0.7494803741982076
    1.2132554086225438