In a grouped data frame, I would like to apply a function that relates the value in the current row to all other values of the same group (and the same column), excluding the value in the current row itself. The result should be a new variable with a single value per row. So if the group consists of c(1,2,3), I would like a new variable containing c(fun(1, c(2,3)), fun(2, c(1,3)), fun(3, c(1,2))). My groups are not all the same size. Despite many attempts, I keep getting odd values such as zeros, or errors.
Example code:
library(dplyr)
set.seed(3)
# tibble() replaces the deprecated data_frame()
dat <- tibble(a = 1:10, value = round(runif(10), 2), group = c(1,1,1,2,2,3,3,3,3,4))
# one possible function
dif.dist <- function(x1, x2) sum(abs(x1 - x2))/(length(x2)-1)
# with this, the grouping sometimes gets lost in "vec" and I receive zeros
x <- dat %>%
  group_by(group) %>%
  mutate(vec = list(value)) %>%
  mutate(dif = dif.dist(unique(value), unlist(vec)[unlist(vec) != value])) %>%
  ungroup()
# another try with plyr, which returns only 0
dat <- ddply(dat, .(group), mutate, dif = dif.dist(value[a == a], value[value != value[a == a]]))
But the function itself works:
dif.dist(dat$value[1],dat$value[2:3])
[1] 0.85
Later, I need this to compute distance matrices for a large set of variables for each participant. I would be thankful for any help!
One option is to group by 'group', loop over the row indices within each group, and use each index both to pick the current element of 'value' and to drop it from the rest:
library(dplyr)
library(purrr)
out <- dat %>%
  group_by(group) %>%
  mutate(dif = map_dbl(row_number(), ~ dif.dist(value[.x], value[-.x])))
head(out, 2)
# A tibble: 2 x 4
# Groups: group [1]
# a value group dif
# <int> <dbl> <dbl> <dbl>
#1 1 0.17 1 0.85
#2 2 0.81 1 1.07
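The same leave-one-out idea can also be sketched in base R, which may help to see what happens in small groups. This is only an illustration: `leave_one_out` is a hypothetical helper name (not from the post), and the data below are a small made-up table rather than the `dat` from the question. With `dif.dist` as written, a group of two rows divides by `length(x2) - 1 = 0`, and a group of one row compares against an empty vector and returns 0, which may explain some of the odd values.

```r
# dif.dist as defined in the question
dif.dist <- function(x1, x2) sum(abs(x1 - x2)) / (length(x2) - 1)

# Hypothetical helper: apply `fun` to each element of `x`
# and the remaining elements of the same vector.
leave_one_out <- function(x, fun) {
  vapply(seq_along(x), function(i) fun(x[i], x[-i]), numeric(1))
}

# Small made-up example with groups of size 3, 2 and 1
toy <- data.frame(
  a     = 1:6,
  value = c(0.17, 0.81, 0.38, 0.33, 0.60, 0.60),
  group = c(1, 1, 1, 2, 2, 3)
)

# ave() splits `value` by `group`, applies the function to each piece,
# and returns the results in the original row order.
toy$dif <- ave(toy$value, toy$group,
               FUN = function(v) leave_one_out(v, dif.dist))

toy
# group 1 gives 0.85, 1.07, 0.64; group 2 gives Inf, Inf; group 3 gives 0
```

If groups with fewer than three rows should not produce a value, one option is to return `NA_real_` inside the helper when `length(x) < 3`; whether that is appropriate depends on what the distance is meant to express.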