I have 5 data frames, split from one according to a variable, to which I want to apply the same function based on the same 3 columns from each data frame. Each contains 10,000 rows.
My data:
Dist X Y deg ofs Z
1 20.21 499.3 3577 4.77 0 19.750
2 20.23 482.3 3578 4.77 -50 19.731
3 20.23 481.3 3578 4.77 -25 19.741
4 20.23 480.3 3578 4.77 0 19.750
5 20.23 479.3 3578 4.77 25 19.749
6 20.24 478.3 3578 4.77 50 19.740
Split like this:
splitdf <- split(df, df$ofs)
str(offset)
X1 <- splitdf$`-50`
X2 <- splitdf$'-25'
X3 <- splitdf$'0'
X4 <- splitdf$'25'
X5 <- splitdf$'50'
df.list <- list(X1,X2,X3,X4,X5)
I have created two functions that apply the following trigonometric formulas:
(X + distance * cos(angle)), (Y - distance * sin(angle))
NewX <- function(x){
df.list[[i]][2] + df.list[[i]][5] * cos(df.list[[i]][4])
}
NewY <- function(x) {
df.list[[i]][3] - df.list[[i]][5] * sin(df.list[[i]][4])
}
I then created a loop to apply these functions to each data frame, thus creating new columns.
for (i in 1:length(df.list)){
df.list[[i]]$newcol1 <- lapply(df.list[[i]]$X, FUN=NewX)
df.list[[i]]$newcol2 <- lapply(df.list[[i]]$Y, FUN=NewY)
}
Unfortunately, this yields neither results nor error messages, although the console stays busy for a few minutes.
I tried again with the data before splitting to separate data frames using:
NewX <- function(x){
df[2] + df[5] * cos(df[4])
}
NewY <- function(x) {
df[3] - df[5] * sin(df[4])
}
for (i in 1:length(df)){
df$newX <- lapply(df$X, FUN=NewX)
df$newY <- lapply(df$Y, FUN=NewY)
}
This approach is too heavy and does not yield a result even after one hour. In either case I don't get any error messages, so it is very difficult to know what I'm doing wrong.
Does anyone have any ideas? Thanks!
EDIT
I ran the loop over the single, unsplit data frame, changing the code to store the output in a new data frame.
for (i in 1:length(df)){
lapply(df$X, FUN=NewX)
lapply(df$Y, FUN=NewY) -> newdf
}
A NewX column is created, and inside each cell is a single-column data frame with 50,000 results.
Removing the loop and running with a pipe yields Error in FUN(X[[i]],...): Unused argument
Actually, you could do that with by.
fun <- function(x) cbind(x,
  newcol1 = x[, 2] + x[, 5] * cos(x[, 4]),
  newcol2 = x[, 3] - x[, 5] * sin(x[, 4]))
by(df, df$ofs, fun)
# df$ofs: -50
# Dist X Y deg ofs Z newcol1 newcol2
# 2 20.23 482.3 3578 4.77 -50 19.731 479.421 3528.083
# ---------------------------------------------------------------------------------------------
# df$ofs: -25
# Dist X Y deg ofs Z newcol1 newcol2
# 3 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# ---------------------------------------------------------------------------------------------
# df$ofs: 0
# Dist X Y deg ofs Z newcol1 newcol2
# 1 20.21 499.3 3577 4.77 0 19.75 499.3 3577
# 4 20.23 480.3 3578 4.77 0 19.75 480.3 3578
# ---------------------------------------------------------------------------------------------
# df$ofs: 25
# Dist X Y deg ofs Z newcol1 newcol2
# 5 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# ---------------------------------------------------------------------------------------------
# df$ofs: 50
# Dist X Y deg ofs Z newcol1 newcol2
# 6 20.24 478.3 3578 4.77 50 19.74 481.179 3627.917
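For comparison, here is a minimal sketch of the same idea using the split() and lapply() route from the question. In the question's loop, NewX and NewY ignore their argument x and return a whole column on every call, so lapply(df$X, FUN=NewX) builds one full-length result per row, which would explain both the long run time and the cells that each hold a data frame. Applying one function per data frame avoids that. The helper name add_xy is hypothetical, and the sample data is only the six rows shown in the question.

```r
# Sample rows copied from the question
df <- data.frame(
  Dist = c(20.21, 20.23, 20.23, 20.23, 20.23, 20.24),
  X    = c(499.3, 482.3, 481.3, 480.3, 479.3, 478.3),
  Y    = c(3577, 3578, 3578, 3578, 3578, 3578),
  deg  = 4.77,
  ofs  = c(0, -50, -25, 0, 25, 50),
  Z    = c(19.750, 19.731, 19.741, 19.750, 19.749, 19.740)
)

# Hypothetical helper: takes ONE data frame and returns it with both
# new columns added. cos() and sin() are vectorised, so no inner loop
# over rows is needed.
add_xy <- function(d) {
  d$newcol1 <- d$X + d$ofs * cos(d$deg)
  d$newcol2 <- d$Y - d$ofs * sin(d$deg)
  d
}

# One call per group; the result is a named list of data frames
df.list <- lapply(split(df, df$ofs), add_xy)
df.list[["-50"]]
```

Referring to columns by name (d$X, d$ofs, d$deg) rather than by position is a design choice here: it keeps working if the column order changes.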
If you plan to reassemble it:
do.call(rbind, by(df, df$ofs, fun))
# Dist X Y deg ofs Z newcol1 newcol2
# -50 20.23 482.3 3578 4.77 -50 19.731 479.4210 3528.083
# -25 20.23 481.3 3578 4.77 -25 19.741 479.8605 3553.041
# 0.1 20.21 499.3 3577 4.77 0 19.750 499.3000 3577.000
# 0.4 20.23 480.3 3578 4.77 0 19.750 480.3000 3578.000
# 25 20.23 479.3 3578 4.77 25 19.749 480.7395 3602.959
# 50 20.24 478.3 3578 4.77 50 19.740 481.1790 3627.917
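Since each new value depends only on the columns of its own row, the split may not be needed at all. The sketch below computes both columns directly on the unsplit data frame, assuming the six sample rows from the question. Note that R's cos() and sin() expect radians, and the outputs above treat deg as radians; if deg actually holds degrees (an assumption the question does not confirm), it would need converting with deg * pi / 180 first.

```r
# Sample rows copied from the question
df <- data.frame(
  Dist = c(20.21, 20.23, 20.23, 20.23, 20.23, 20.24),
  X    = c(499.3, 482.3, 481.3, 480.3, 479.3, 478.3),
  Y    = c(3577, 3578, 3578, 3578, 3578, 3578),
  deg  = 4.77,
  ofs  = c(0, -50, -25, 0, 25, 50),
  Z    = c(19.750, 19.731, 19.741, 19.750, 19.749, 19.740)
)

# Vectorised arithmetic works on whole columns at once,
# so no loop, lapply() or split() is required.
df$newcol1 <- df$X + df$ofs * cos(df$deg)
df$newcol2 <- df$Y - df$ofs * sin(df$deg)

# If the groups are still wanted afterwards, split once at the end
splitdf <- split(df, df$ofs)
```

On 50,000 rows this is a single pass over each column, so it should finish almost immediately.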