I have these data
Open High Low Close
1 6709.0 6709.5 6703.5 6705.0
2 6705.0 6710.5 6705.0 6710.0
3 6710.5 6713.5 6710.5 6713.5
4 6713.5 6713.5 6711.5 6712.0
5 6712.5 6715.0 6712.5 6714.0
6 6714.0 6716.0 6713.5 6716.0
which are OHLC
prices of 1
minute bars. I want to find OHLC
of 5 minute bars.
Open
and Close
are easy, but I don't know how to do High
and Low
.
I need to find the maximum/minimum
values in a subset of the data.
So first I need to find max(data[1:5,2])
then max(data[6:10,2])
and so on. And I need to put the values in a vector called High
. And for Low
prices min(data[1:5,3])
.
I need something like a for loop
with two variables.
I do not know how big is your dataset. However, for loop
would be slow in this case. Personally, I would rather go for an aggregate
function, specifically, the one already implemented in Rbase
.
The first thing we have to do is to create an index that will allow us to perform the function on each subgroup. Each subgroup contains OHLC of 5 minute bars.
Therefore, let's start by reading the data
read the table
df <- read.table(text = "Open High Low Close
+ 1 6709.0 6709.5 6703.5 6705.0
+
+ 2 6705.0 6710.5 6705.0 6710.0
+
+ 3 6710.5 6713.5 6710.5 6713.5
+
+ 4 6713.5 6713.5 6711.5 6712.0
+
+ 5 6712.5 6715.0 6712.5 6714.0")
Increase the size of df to better simulate the real use case scenario.
df <- rbind(df,df,df,df)
Create an index, first using vector recycling
and the cumsum()
function
df$group <- c(1,0,0,0,0)
df$group <- cumsum(df$group)
Now, finally we can obtain the Min and Max value for each 5 OHLC 5 minute bars and insert them into your final vectors, High
and Low
High <- aggregate(High ~ group, data = df, FUN = max)$High
Low <- aggregate(Low ~ group, data = df, FUN = min)$Low
A faster alternative to aggregate is given by data.table
package
require(data.table)
dt <- data.table(df)
High <- dt[ , max(High), by = group]$V1
Low <- dt[ , min(Low), by = group]$V1