rdataframedifference

Calculate the difference between successive observations rowwise in a dataframe


I have this hypothetical dataframe

df<- data.frame(a_13=c(NA, 1, 2),
                a_12=c(NA, 4, 3), 
                a_11=c(NA, 2,NA))
df 
  a_13 a_12 a_11
1   NA   NA   NA
2    1    4    2
3    2    3   NA

I want to calculate the difference between successive obs rowwise in the above dataframe, that is, I want to augment the above dataframe with 3 additional columns (I_13, I_12, I_11), where each row of these new columns will result from successive difference of the elements of the corresponding row of columns (a_13, a_12, a_11), that is, (a_13 - a_12, a_12 - a_11, a_11)

So my new data frame should be

df_D <- data.frame(a_13=c(NA, 1, 2),
                   a_12=c(NA, 4, 3), 
                   a_11=c(NA, 2,NA),
                   I_13=c(NA-NA,1-4, 2-3),
                   I_12=c(NA-NA, 4-2,3-NA),
                   I_11=c(NA,2,NA))

df_D
  a_13 a_12 a_11 I_13 I_12 I_11
1   NA   NA   NA   NA   NA   NA
2    1    4    2   -3    2    2
3    2    3   NA   -1   NA   NA

The last column I_11 is essentially a_11.

My real dataframe contains hundreds of thousands of rows and a_13 up to a_1. So a more general code would help.

I tried something like

NN<-names(df)
for (j in NN){
  df$N<-df[,which(NN==j)] - df[,which(NN==j)+1]
  re<-paste("I", j, sep="_")
  names(df)[ncol(df)]<-re
}
 

But I do not get the result that I want


Solution

  • You can use diff(). Note that if diff is applied on a matrix, the difference operations are carried out on each column separately. Thus, we need to transpose the data before diff() and transpose it back thereafter.

    t(-diff(t(cbind(df, 0))))
    
    #      a_12 a_11  0
    # [1,]   NA   NA NA
    # [2,]   -3    2  2
    # [3,]   -1   NA NA
    

    To combine it with the original data:

    df[sub("^a", "I", names(df))] <- t(-diff(t(cbind(df, 0))))
    df
    
    #   a_13 a_12 a_11 I_13 I_12 I_11
    # 1   NA   NA   NA   NA   NA   NA
    # 2    1    4    2   -3    2    2
    # 3    2    3   NA   -1   NA   NA