sqlpostgresqlfinanceohlc

PostgreSQL resample 1 minute OHLCV candle data into 5 minute OHLCV candle data


I have a postgreSQL database with the following columns:

timestamp <bigint> - Unix timestamp in milliseconds
open      <float>  - The first traded price
high      <float>  - The highest traded price
low       <float>  - THe lowest traded price
close     <float>  - The last traded price
volume    <float>  - The total volumed transacted during the time period

Sometimes I'll query data that spans 30+ days (43.2k rows). These queries take a really long time, so I thought whenever I have >30 days of data I'll fetch 5 minute candles instead which would cut down the row count by 5x.

Basically I'm looking for a function that does this (pseudocode):

SELECT 
first_value(timestamp),
first_value(open),
max(high),
min(low),
last_value(close),
sum(volume)
WHERE timestamp > {some timestamp in the past} AND timestamp < {current time}
GROUP BY floor((timestamp / 1000) / (60*5)) * 5
ORDER BY timestamp

Any help would be greatly appreciated, been stuck on this one for a while


Solution

  • Used a subquery to get the column: minute5. On it made grouping and sorting. To get the first value, 'open' use array_agg with sorting and took the value through square brackets. The same with 'close', but in the other direction.

    SELECT 
    min(timestamp) AS timestamp, 
    (array_agg(open ORDER BY timestamp))[1] AS open, 
    max(high) AS high, 
    min(low) AS low,  
    (array_agg(close ORDER BY timestamp DESC))[1] AS close, 
    sum(volume) volume
    FROM
    (SELECT*, floor((timestamp/1000) / (60*5)) * 5 AS minute5
    FROM stock
    ORDER BY timestamp
    ) t
    GROUP BY
    minute5
    ORDER BY
    minute5
    ;
    

    Here's the link: creating a table and getting data (added a dt column with a type to visually see what date and time for each row).