XTS: split FX intraday bar data by trading days


I want to apply a function to 20 trading days' worth of hourly FX data (as one example amongst many).

I started off with rollapply(data,width=20*24,FUN=FUN,by=24). That seemed to be working well, and I could even assert that I always got 480 bars passed in... until I realized that wasn't what I wanted. The start and end times of those 480 bars were drifting over the years, due to daylight saving time changes and market holidays.

So, what I want is a function that treats a day as from 22:00 to 22:00 of each day we have data for. (21:00 to 21:00 in N.Y. summertime - my data timezone is UTC, and daystart is defined at 5pm ET)
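
The two cutoffs are the same wall-clock time in New York. A minimal check in base R, using two arbitrarily chosen dates (one in winter, one in summer) and assuming the system's time zone database knows America/New_York:

```r
# 17:00 New York on a winter date and on a summer date (dates are arbitrary).
ny_close <- as.POSIXct(c("2012-01-16 17:00:00", "2012-07-16 17:00:00"),
                       tz = "America/New_York")

# The same instants, formatted as UTC clock time.
utc_hour <- format(ny_close, "%H:%M", tz = "UTC")
print(utc_hour)
```

This gives 22:00 for the winter date and 21:00 for the summer date, which is why a fixed UTC cutoff cannot be right all year.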

So, I made my own rollapply function with this at its core:

 ep=endpoints(data,on=on,k=k) 
 sp=ep[1:(length(ep)-width)]+1
 ep=ep[(width+1):length(ep)]
 xx <- lapply(1:length(ep), function(ix) FUN(.subset_xts(data,sp[ix]:ep[ix]),...) )

I then called this with on="days", k=1 and width=20.
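
To see what those start and end indices look like, here is a small sketch on synthetic data: four days of hourly UTC bars and width=2 rather than 20. The series and the window width are made up for illustration; the index arithmetic is the same as above.

```r
library(xts)

# Four days of synthetic hourly bars, indexed in UTC.
idx  <- seq(as.POSIXct("2012-01-02 00:00:00", tz = "UTC"), by = "hour", length.out = 96)
data <- xts(seq_along(idx), order.by = idx)

width <- 2
ep <- endpoints(data, on = "days", k = 1)  # last bar of each day, with a leading 0
sp <- ep[1:(length(ep) - width)] + 1       # first bar of each window
ep <- ep[(width + 1):length(ep)]           # last bar of each window

print(cbind(sp, ep))
```

Each window spans 48 bars (rows 1 to 48, 25 to 72 and 49 to 96), so the window advances one day at a time while always covering two whole days.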

This has two problems:

  1. Days is in days, not trading days! So, instead of typically 4 weeks of data, I get just under 3 weeks of data.
  2. The cutoff is midnight UTC. I cannot work out how to change it to use the 22:00 (or 21:00) cutoff.

UPDATE: Problem 1 above is wrong! The XTS endpoints function does work in trading days, not calendar days. The reason I thought otherwise is the timezone issue made it look like a 6-day trading week: Sun to Fri. Once the timezone problem was fixed (see my self-answer), using width=20 and on="days" does indeed give me 4 weeks of data.

(The "typically" there is important: when there is a trading holiday during those 4 weeks I expect to receive 4 weeks and 1 day's worth of data, i.e. always exactly 20 trading days.)

I started working on a function to cut the data into weeks, thinking I could then cut them into five 24hr chunks, but this feels like the wrong approach, and surely someone has invented this wheel before me?


Solution

  • Here is how to get the daybreak right:

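    x2 <- x
    index(x2) <- index(x2) + (7 * 3600)   # shift so that 17:00 New York lands on midnight
    indexTZ(x2) <- 'America/New_York'     # newer xts releases deprecate this in favour of tzone(x2) <- ...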
    

    That is, just setting the timezone to New York puts the daybreak at 17:00 local time. The days get split at midnight, so we want the daybreak to be at 24:00 instead: add 7 hours first.
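
    A quick way to convince yourself of the arithmetic, in base R only (dates picked arbitrarily, one pair in winter and one in summer): a bar just before the 5pm ET cutoff should stay on the same date after the shift, and a bar at the cutoff should roll onto the next date.

    ```r
    # Bars either side of the 5pm ET cutoff, given in UTC.
    bars <- as.POSIXct(c("2012-01-16 21:00:00", "2012-01-16 22:00:00",   # winter: cutoff is 22:00 UTC
                         "2012-07-16 20:00:00", "2012-07-16 21:00:00"),  # summer: cutoff is 21:00 UTC
                       tz = "UTC")

    # Add 7 hours, then read the result as New York local time.
    shifted <- format(bars + 7 * 3600, "%Y-%m-%d %H:%M", tz = "America/New_York")
    print(shifted)
    ```

    In both seasons the bar before the cutoff shows as 23:00 on the 16th and the bar at the cutoff shows as 00:00 on the 17th, so a midnight split of the shifted index is a 5pm ET split of the real one.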

    With help from: time zones in POSIXct and xts, converting from GMT in R

    Here is the full function:

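    rollapply_chunks.FX.xts <- function(data, width, FUN, ..., on = "days", k = 1) {
        data <- try.xts(data)

        # Copy with a shifted index, so that 17:00 New York becomes midnight.
        x2 <- data
        index(x2) <- index(x2) + (7 * 3600)
        indexTZ(x2) <- 'America/New_York'   # tzone(x2) <- ... in newer xts releases

        # The end point of each calendar day of the shifted index (when on="days").
        # Each entry points to the final bar of the day. ep[1]==0.
        ep <- endpoints(x2, on = on, k = k)
        n <- length(ep) - 1   # Number of chunks found.

        if (n < 1) {
            stop("Cannot divide data up")
        } else if (n <= width) {
            # Can only fit one window in (it is short when n < width).
            # Without this guard, 1 < n < width gave invalid subscripts below.
            sp <- 1; ep <- ep[length(ep)]
        } else {
            sp <- ep[1:(length(ep) - width)] + 1
            ep <- ep[(width + 1):length(ep)]
        }

        # Subset the ORIGINAL data, using row numbers found on the shifted copy.
        xx <- lapply(seq_along(ep), function(ix) FUN(.subset_xts(data, sp[ix]:ep[ix]), ...))
        xx <- do.call(rbind, xx)   # Join them up as one big matrix/data.frame.

        tt <- index(data)[ep]  # Implicit align="right". Use sp for align="left"
        res <- xts(xx, tt)
        return(res)
    }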
    

    You can see we use the modified index to split up the original data. (If R uses copy-on-write under the covers, then the only extra memory requirement should be for a copy of the index, not of the data.)
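
    As a sanity check on the splitting itself, the sketch below builds two weeks of synthetic hourly bars (January 2012, so the cutoff is 22:00 UTC throughout), applies the same shift, and looks at where each day starts in the original UTC index. The data are invented, and tzone<- is used in place of indexTZ<- on the assumption of a recent xts release.

    ```r
    library(xts)

    # Hourly UTC bars from Sunday 22:00 to Friday 21:00, two weeks running.
    idx <- seq(as.POSIXct("2012-01-08 22:00:00", tz = "UTC"),
               as.POSIXct("2012-01-20 21:00:00", tz = "UTC"), by = "hour")

    # Drop the weekend gap: Friday 22:00 UTC up to (not including) Sunday 22:00 UTC.
    fri_close <- as.POSIXct("2012-01-13 22:00:00", tz = "UTC")
    sun_open  <- as.POSIXct("2012-01-15 22:00:00", tz = "UTC")
    idx  <- idx[idx < fri_close | idx >= sun_open]
    data <- xts(seq_along(idx), order.by = idx)

    # Shifted copy: 17:00 New York becomes midnight New York.
    x2 <- data
    index(x2) <- index(x2) + 7 * 3600
    tzone(x2) <- "America/New_York"

    ep <- endpoints(x2, on = "days")
    bars_per_day <- diff(ep)

    # First bar of each day, read from the original (unshifted) UTC index.
    day_start <- format(index(data)[head(ep, -1) + 1], "%H:%M", tz = "UTC")
    print(table(bars_per_day))
    print(table(day_start))
    ```

    All ten chunks hold 24 bars and begin at 22:00 UTC, and there is no short Sunday chunk: the Sunday evening bars are counted as part of Monday's trading day, giving a 5-day week.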

    (Legal bit: please consider it licensed under MIT, but explicit permission given to use in the GPL-2 XTS package if that is desired.)