r group-by tapply split-apply-combine

improvement on tapply (shifting groups of vectors)


The order of the return object from tapply() is ambiguous, so I've started to worry about this bit of code:

# example data
d <- data.frame(value = c(1,2,3,5), 
                source = c("a","a","b","b"))
# shift each group so that it starts at 0
d$value <- unlist(tapply(d$value, d$source, function(v) v-v[1]))

The idea is to split value into groups, then shift each group by its first element so that the group starts at 0.
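To make the worry concrete, here is a minimal sketch using a hypothetical data frame d2 (not part of the original example) in which the groups are interleaved rather than contiguous. It assumes that tapply() returns its groups in factor-level order, so the flattened result no longer lines up with the rows:

```r
# Hypothetical data: same values as the example, but with the groups interleaved
d2 <- data.frame(value  = c(1, 3, 2, 5),
                 source = c("a", "b", "a", "b"))

# tapply() returns one element per group, in factor-level order ("a", then "b")
shifted <- unlist(tapply(d2$value, d2$source, function(v) v - v[1]))

# The flattened result is ordered by group, not by row
misaligned <- unname(shifted)

# What each row should receive if the shift were applied in place
expected <- c(0, 0, 1, 2)

# FALSE here: assigning `misaligned` back to d2$value would scramble the rows
identical(misaligned, expected)
```

With the sorted source column from the example above the two vectors happen to coincide, which is why the problem is easy to miss.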

What's a better way to do this? I have several ideas, but I can't think of anything that


Solution

  • A little-known trick in base R is the split(x, y) <- lapply(split(x, y), f) paradigm, so the following one-liner meets all your requirements:

    d <- data.frame(value = c(1,2,3,5), 
                    source = c("a","a","b","b"))
    
    split(d$value, d$source) <- lapply(split(d$value, d$source), \(x) x - x[1])
    

    Resulting in:

    d
    #>   value source
    #> 1     0      a
    #> 2     1      a
    #> 3     0      b
    #> 4     2      b
    

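    This holds whatever the ordering of source, because the replacement form of split() writes each group's result back to the positions it came from.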

    Created on 2023-06-06 with reprex v2.0.2
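
As a quick check of the ordering claim, the sketch below applies the same split<- idiom to a hypothetical data frame d2 whose source column is interleaved (d2 is an assumption for illustration, not data from the question). It also compares the result with base R's ave(), which is a different function from the one used in the answer but is built on the same split/lapply replacement pattern:

```r
# Hypothetical data with interleaved groups
d2 <- data.frame(value  = c(1, 3, 2, 5),
                 source = c("a", "b", "a", "b"))

# Alternative for comparison: ave() applies FUN within each group
# and returns the results in the original row order
via_ave <- ave(d2$value, d2$source, FUN = function(v) v - v[1])

# The idiom from the answer: each group's result is written back
# to the rows that group came from
split(d2$value, d2$source) <- lapply(split(d2$value, d2$source),
                                     function(v) v - v[1])

d2$value
identical(d2$value, via_ave)
```

Rows 1 and 2 are the first elements of groups "a" and "b", so both become 0, while rows 3 and 4 hold each group's offset from its first value.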