r, list, vector, filter, rle

Filter a list of vectors using different run-length conditions


I have the following list of vectors:

vec1 <- c(rep(0, 10), rep(1, 4), rep(0,5), rep(-1,5))
vec2 <- c(rep(-1, 7), rep(0,99), rep(1, 6))
vec3 <- c(rep(1,2), rep(-1,2), rep(0,10), rep(-1,4), rep(0,8))
vec4 <- rep(0, 100)

dummy_list <- list(vec1, vec2, vec3, vec4)
names(dummy_list) <- c("first", "second", "third", "fourth")

I want to keep only the vectors that meet certain run-length conditions.

Keep the vectors with at least 4 adjacent non-0 values (it works)

## first, second and third vectors
dummy_list2 <- Filter(function (x) any(with(rle(x), lengths[values != 0] >= 4)),
                      dummy_list)

Keep vectors which contain something other than 0s (it works)

## first, second and third vectors
dummy_list3 <- Filter(function (x) which(sum(x) != 0),
                      dummy_list)

Keep vectors with fewer than 5 adjacent non-0 values (it doesn't work)

## should contain third and fourth vectors
## but gives me first and third vectors
dummy_list4 <- Filter(function(x) any(with(rle(x), lengths[values != 0] < 5)),
                      dummy_list)

How can I make it work?


Solution

  • rle(x) is not well suited here: it splits adjacent nonzero values that differ (e.g. 1, 1, -1, -1) into separate runs. Instead, count the nonzero values in each length-w window of x:

    WindowNNZ <- function (x, w) rowSums(stats::embed(x != 0, w))
    
    ## vectors with at least 4 adjacent nonzero values
    dummy_list2 <- Filter(function(x) any(WindowNNZ(x, 4) == 4), dummy_list)
    names(dummy_list2)
    #[1] "first"  "second" "third"
    
    ## vectors with at least 1 nonzero value
    dummy_list3 <- Filter(function(x) any(WindowNNZ(x, 1) == 1), dummy_list)
    names(dummy_list3)
    #[1] "first"  "second" "third" 
    
    ## vectors with fewer than 5 adjacent nonzero values
    dummy_list4 <- Filter(function(x) all(WindowNNZ(x, 5) < 5), dummy_list)
    names(dummy_list4)
    #[1] "third"  "fourth"
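
To make the window counts concrete, here is a minimal sketch on a toy vector, followed by a cross-check with rle(). The original `any(... < 5)` attempt fails for two reasons: it is TRUE as soon as a single short run exists, and it is FALSE for an all-zero vector because `any(logical(0))` is FALSE. The `toy` vector and the `MaxNonzeroRun` helper are hypothetical names introduced only for this illustration; they are not part of the answer above.

```r
## Data from the question
dummy_list <- list(
  first  = c(rep(0, 10), rep(1, 4), rep(0, 5), rep(-1, 5)),
  second = c(rep(-1, 7), rep(0, 99), rep(1, 6)),
  third  = c(rep(1, 2), rep(-1, 2), rep(0, 10), rep(-1, 4), rep(0, 8)),
  fourth = rep(0, 100)
)

## Window counts, as defined in the answer
WindowNNZ <- function (x, w) rowSums(stats::embed(x != 0, w))

## Toy vector (hypothetical): two adjacent nonzero values, then a lone one
toy <- c(0, 1, -1, 0, 0, 2)

## One count per length-3 window: (0,1,-1), (1,-1,0), (-1,0,0), (0,0,2)
WindowNNZ(toy, 3)
#[1] 2 2 1 1

## Hypothetical helper: length of the longest run of adjacent nonzero values.
## rle() is applied to x != 0, so 1 and -1 belong to the same run.
MaxNonzeroRun <- function (x) {
  r <- rle(x != 0)
  max(r$lengths[r$values], 0)  # the extra 0 covers all-zero vectors
}

MaxNonzeroRun(toy)
#[1] 2

## Cross-check: vectors with fewer than 5 adjacent nonzero values
names(Filter(function(x) MaxNonzeroRun(x) < 5, dummy_list))
#[1] "third"  "fourth"
```

Testing the longest run against the threshold plays the same role as `all()` in the window version: every run, not just one of them, has to be shorter than 5.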