I have this hypothetical dataframe
df<- data.frame(a_13=c(NA, 1, 2),
a_12=c(NA, 4, 3),
a_11=c(NA, 2,NA))
df
a_13 a_12 a_11
1 NA NA NA
2 1 4 2
3 2 3 NA
I want to calculate the difference between successive obs rowwise in the above dataframe, that is,
I want to augment the above dataframe with 3 additional columns (I_13
, I_12
, I_11
), where each
row of these new columns will result from successive difference of the elements of the corresponding
row of columns (a_13
, a_12
, a_11
), that is, (a_13 - a_12
, a_12 - a_11
, a_11
)
So my new data frame should be
df_D <- data.frame(a_13=c(NA, 1, 2),
a_12=c(NA, 4, 3),
a_11=c(NA, 2,NA),
I_13=c(NA-NA,1-4, 2-3),
I_12=c(NA-NA, 4-2,3-NA),
I_11=c(NA,2,NA))
df_D
a_13 a_12 a_11 I_13 I_12 I_11
1 NA NA NA NA NA NA
2 1 4 2 -3 2 2
3 2 3 NA -1 NA NA
The last column I_11
is essentially a_11
.
My real dataframe contains hundreds of thousands of rows and a_13
up to a_1
. So a more general code would help.
I tried something like
NN<-names(df)
for (j in NN){
df$N<-df[,which(NN==j)] - df[,which(NN==j)+1]
re<-paste("I", j, sep="_")
names(df)[ncol(df)]<-re
}
But I do not get the result that I want
You can use diff()
. Note that if diff
is applied on a matrix, the difference operations are carried out on each column separately. Thus, we need to transpose the data before diff()
and transpose it back thereafter.
t(-diff(t(cbind(df, 0))))
# a_12 a_11 0
# [1,] NA NA NA
# [2,] -3 2 2
# [3,] -1 NA NA
To combine it with the original data:
df[sub("^a", "I", names(df))] <- t(-diff(t(cbind(df, 0))))
df
# a_13 a_12 a_11 I_13 I_12 I_11
# 1 NA NA NA NA NA NA
# 2 1 4 2 -3 2 2
# 3 2 3 NA -1 NA NA