rstocktrading

Create 5 min bars out of 1 min bars in R


I have these data

  Open   High   Low    Close
1 6709.0 6709.5 6703.5 6705.0
2 6705.0 6710.5 6705.0 6710.0
3 6710.5 6713.5 6710.5 6713.5
4 6713.5 6713.5 6711.5 6712.0
5 6712.5 6715.0 6712.5 6714.0
6 6714.0 6716.0 6713.5 6716.0

which are OHLC prices of 1 minute bars. I want to find OHLC of 5 minute bars. Open and Close are easy, but I don't know how to do High and Low.

I need to find the maximum/minimum values in a subset of the data. So first I need to find max(data[1:5,2]) then max(data[6:10,2]) and so on. And I need to put the values in a vector called High. And for Low prices min(data[1:5,3]).

I need something like a for loop with two variables.


Solution

  • I do not know how big is your dataset. However, for loop would be slow in this case. Personally, I would rather go for an aggregate function, specifically, the one already implemented in Rbase.

    The first thing we have to do is to create an index that will allow us to perform the function on each subgroup. Each subgroup contains OHLC of 5 minute bars.

    Therefore, let's start by reading the data

    read the table

    df <- read.table(text = "Open   High    Low  Close
    + 1 6709.0 6709.5 6703.5 6705.0
    + 
    + 2 6705.0 6710.5 6705.0 6710.0
    + 
    + 3 6710.5 6713.5 6710.5 6713.5
    + 
    + 4 6713.5 6713.5 6711.5 6712.0
    + 
    + 5 6712.5 6715.0 6712.5 6714.0")
    

    Increase the size of df to better simulate the real use case scenario.

    df <- rbind(df,df,df,df)
    

    Create an index, first using vector recycling and the cumsum() function

    df$group <- c(1,0,0,0,0)
    df$group <- cumsum(df$group)
    

    Now, finally we can obtain the Min and Max value for each 5 OHLC 5 minute bars and insert them into your final vectors, High and Low

    High <- aggregate(High ~ group, data = df, FUN = max)$High
    Low <- aggregate(Low ~ group, data = df, FUN = min)$Low
    

    A faster alternative to aggregate is given by data.table package

    require(data.table)
    dt <- data.table(df)
    High <- dt[ , max(High), by = group]$V1
    Low <- dt[ , min(Low), by = group]$V1