rdatextslags

How to lag date-index in a time-series in R?


REWRITTEN QUESTION HERE:

I've made some progress but am getting odd behaviour from R...

Here's the xts I'm starting with

<no title>  Value   Value2  Value3
2002-08-21  21      2       27
2003-09-10  22      42      87
2004-02-12  23      62      67
2005-04-13  24      13      73
2006-05-13  25      4       28
2007-08-14  20      68      25
2008-03-06  19      82      22

What I want to produce:

 <no title> Value   Value2  Value3  ThisDate    NextDate
    2002-08-21  21      2       27      2002-08-21  2003-09-10
    2003-09-10  22      42      87      2003-09-10  2004-02-12
    2004-02-12  23      62      67      2004-02-12  2005-04-13
    2005-04-13  24      13      73      2005-04-13  2006-05-13
    2006-05-13  25      4       28      2006-05-13  2007-08-14
    2007-08-14  20      68      25      2007-08-14  2008-03-06
    2008-03-06  19      82      22      2008-03-06  NA

I've written a function like this:

StackUpAdjacentDates <- function(sourceTimeSeries)
{
    returnValue <- sourceTimeSeries

    thisDate <- as.character(index(sourceTimeSeries))
    nextDate <- c(as.character(thisDate[2:length(thisDate)]),NA)

    thisDate <- as.Date(strptime(thisDate, "%Y-%m-%d"))
    nextDate <- as.Date(strptime(nextDate, "%Y-%m-%d"))

    # set up thisDate in a new column
    if ("thisDate" %in% colnames(returnValue) )
    {
        returnValue<-returnValue[,-which(colnames(returnValue)=="thisDate")]
    }
    returnValue <- cbind(returnValue, thisDate)
    colnames(returnValue)[ncol(returnValue)] <- "thisDate"
    returnValue$thisDate <- thisDate

    # add nextDate in a new column
    if ("nextDate" %in% colnames(returnValue) )
    {
        returnValue<-returnValue[,-which(colnames(returnValue)=="nextDate")]
    }
    returnValue <- cbind(returnValue,nextDate)
    colnames(returnValue)[ncol(returnValue)] <- "nextDate"
    #returnValue$nextDate <- nextDate

}

This successfully adds thisDate (running the code step-wise at the command-line). But the bit that adds nextDate seems to over-write it! I also seem to get an unexpected row of NAs. Still working on this...

<no title>  Value   Value2  Value3  nextDate
2002-08-21  21      78      76      12305
2003-09-10  22      70      23      12460
2004-02-12  23      84      22      12886
2005-04-13  24      97      28      13281
2006-05-13  25      26      97      13739
2007-08-14  20      59      22      13944
2008-03-06  19      64      98      NA
<NA>        NA      NA      NA      NA

I've put "no title" in the first column to indicate that it's the xts date-index rather than actually a part of the vector/matrix.

The bit about removing the extra row is because I've not yet solved the over-write problem and was experimenting. It doesn't need to be there in the final answer but is where I am up to at present.

And lastly, when I interrogate this result and try to convert nextDate to a date I get....

> as.Date(returnValue$nextDate)
Error in as.Date.default(returnValue$nextDate) : 
  do not know how to convert 'returnValue$nextDate' to class "Date"

So I'm in a bit of a muddle...

ORIGINAL QUESTION BELOW:

I have a time-series in R (which I am learning fast, but clearly not fast enough!) like this

             Value
2002-08-21    21
2003-09-10    22
2004-02-12    23
2005-04-13    24
2006-05-13    25
2007-08-14    20
2008-03-06    19

I want to create a derivative of it with the date-index in the NEXT row in a new column in each row:

              Value    NextDate
2002-08-21    21       2003-09-10
2003-09-10    22       2004-02-12
2004-02-12    23       2005-04-13
2005-04-13    24       2006-05-13
2006-05-13    25       2007-08-14
2007-08-14    20       2008-03-06
2008-03-06    19       [...]

It's pretty easy to do for Value (using Lag) but not for the date-index iteself.

I can probably work out how to do it using various lookups and the like, but it is messy. You have to match on some other field, or fiddle around with row-numbers which doesn't feel very "true to R".

Is there a nice, neat, elegant way to do it?

I'm pretty sure I'll go "D'OH!" as soon as someone gives the answer! But so far I haven't found an answer on this site for lagging the date-index.

The reason I want to do this is I then want to use each pair of dates in a row to interrogate another series. So there might be a better way to do this.


Solution

  • I think this is similar to what you actually want to do:

    library(xts)
    #create example xts
    times <- seq(as.Date('2002-08-21'),as.Date('2002-09-06 '),by="day")
    myts <- xts(x=1:length(times),order.by=times)
    
    #second xts, with start and end times
    times2 <- c("2002-08-21","2002-08-31","2002-09-06")    
    myts2 <- myts[times2] 
    
    #get start and end times
    ix <- index(myts2)
    
    #get positions in myts
    ep <- which(index(myts) %in% ix)-1
    
    #calculate means
    period.apply(myts,ep,mean) 
    

    Note: This includes the starting time and excludes the end time, when calculating the period mean.