When loading the kraken.com historical data feed into pandas to convert tick data (time and sales) to OHLC (Open, High, Low, Close) data, I run into an issue: I am not able to resample with 'high' and 'low', as they throw errors, but 'first', 'last', and 'sum' work just fine. I don't know why, or what the available options are; it seems there is an enumerated list of options (total guess there).
The code below takes heavy inspiration (outright copying blindly) from ryantjo's Resampling Market Tick Data.
high_df = df.resample("1H").agg({'trade_price': 'high'})
AttributeError: 'SeriesGroupBy' object has no attribute 'high'
Full Code:
import pandas as pd
df = pd.read_csv('C:/source/tradingdata/Kraken/STORJETH.csv', names=[
'ms', 'trade_price', 'volume'], parse_dates=True)
df['td'] = pd.to_timedelta(df['ms'], 'ms')
df = df.set_index(df['td'])
df.head()
open_df = df.resample("1H").agg({'trade_price': 'first'})
high_df = df.resample("1H").agg({'trade_price': 'high'})
low_df = df.resample("1H").agg({'trade_price': 'low'})
close_df = df.resample("1H").agg({'trade_price': 'last'})
volume_df = df.resample("1H").agg({'volume': 'sum'})
... More code that hasn't even been tested yet.
If you want to extract the highest value from each group, you need to use 'max' (and 'min' for the lowest):
high_df = df.resample("1H").agg({'trade_price': 'max'})
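In general, agg() takes a dictionary whose keys are column names and whose values are aggregation functions: either the name of a pandas aggregation method given as a string (such as 'min', 'max', 'first', 'last', 'sum', or 'mean'), a callable, or a list of these. 'high' and 'low' are not pandas methods, which is why they raise AttributeError.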
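To make this concrete, here is a minimal sketch of building hourly OHLC bars in one pass. The tick values and timestamps are made up for illustration and stand in for the Kraken CSV, and the names ticks, hourly, bars, and ohlc are hypothetical. It uses "60min" rather than "1H" on the assumption that your pandas version may warn about the older alias.

```python
import pandas as pd

# Hypothetical tick data standing in for the Kraken CSV (values are made up).
ticks = pd.DataFrame(
    {
        "trade_price": [10.0, 12.0, 9.0, 11.0, 20.0, 18.0],
        "volume": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    },
    index=pd.to_datetime(
        [
            "2021-01-01 00:05",
            "2021-01-01 00:20",
            "2021-01-01 00:40",
            "2021-01-01 00:55",
            "2021-01-01 01:10",
            "2021-01-01 01:30",
        ]
    ),
)

# Group the ticks into one-hour bins.
hourly = ticks.resample("60min")

# A single agg() call: each column maps to a list of pandas method names,
# and every name produces one output column.
bars = hourly.agg(
    {"trade_price": ["first", "max", "min", "last"], "volume": ["sum"]}
)

# The result has two-level column labels; flatten them to the usual names.
bars.columns = ["open", "high", "low", "close", "volume"]

# Resampler also has a built-in ohlc() shortcut for a single price column.
ohlc = hourly["trade_price"].ohlc()

print(bars)
print(ohlc)
```

The ohlc() shortcut covers only the price columns, so volume still needs its own 'sum'; the single agg() call is handy when you want everything in one frame.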