
R: Split A Vector Into Subvectors with Overlaps and Rebounding


I want to split a parent vector called v into some subvectors with the following conditions:

  1. Each sub-vector has an equal length l, which is less than the length of the parent vector v.

  2. Each sub-vector is unique in its elements' composition and contains consecutive elements.

  3. Elements of a particular sub-vector overlap with elements of the previous and subsequent sub-vectors.

  4. The same element must not overlap more than l - 1 times in consecutive sub-vectors, since the starting elements of the sub-vectors are arranged in ascending order.

  5. Some sub-vectors can rebound (wrap around), so that elements from the end of the parent vector followed by elements from its start make a sub-vector.

  6. The input should be a vector for the parent vector v and an integer for the block length l. The output should be a list of vectors (not a matrix): each sub-vector is returned as a vector, and all sub-vectors are collected in a list.

For illustration, consider a parent vector of x1 to x10 with a subvector size of l = 3 consecutive elements of its parent vector as follows:

x1, x2, x3
    x2, x3, x4
            x4, x5, x6
                x5, x6, x7
                        x7, x8, x9
                            x8, x9, x10
                                    x10, x1, x2

Here I form a series of sub-vectors, each of length l = 3, whose starting elements are progressive (x1, x2, x4, x5, x7, x8, x10) and not recursive. The third sub-vector starts from x4 and not x3 because starting it from x3 would make x3 overlap l times and not l - 1 times at that stage. The same consideration applies to the 5th and the 7th sub-vectors.

For another illustration, consider a parent vector of x1 to x10 with a subvector size of l = 4 consecutive elements of its parent vector as follows:

x1, x2, x3, x4
    x2, x3, x4, x5
        x3, x4, x5, x6
                x5, x6, x7, x8
                    x6, x7, x8, x9
                        x7, x8, x9, x10
                                x9, x10, x1, x2
                                    x10, x1, x2, x3

Here I form a series of sub-vectors, each of length l = 4, whose starting elements are progressive (x1, x2, x3, x5, x6, x7, x9, x10) and not recursive. The fourth sub-vector starts from x5 and not x4 because starting it from x4 would make x4 overlap l times and not l - 1 times at that stage. The same consideration applies to the 7th sub-vector.

As yet another illustration, consider a parent vector of x1 to x10 with a sub-vector size of l = 5 consecutive elements of its parent vector as follows:

x1, x2, x3, x4, x5
    x2, x3, x4, x5, x6
        x3, x4, x5, x6, x7
            x4, x5, x6, x7, x8
                    x6, x7, x8, x9, x10
                        x7, x8, x9, x10, x1
                            x8, x9, x10, x1, x2
                                x9, x10, x1, x2, x3

As yet another illustration, consider a parent vector of x1 to x10 with a sub-vector size of l = 6 consecutive elements of its parent vector as follows:

x1, x2, x3, x4, x5, x6
    x2, x3, x4, x5, x6, x7
        x3, x4, x5, x6, x7, x8
            x4, x5, x6, x7, x8, x9
                x5, x6, x7, x8, x9, x10
                        x7, x8, x9, x10, x1, x2
                            x8, x9, x10, x1, x2, x3
                                x9, x10, x1, x2, x3, x4
                                    x10, x1, x2, x3, x4, x5

As yet another illustration, consider a parent vector of x1 to x10 with a sub-vector size of l = 7 consecutive elements of its parent vector as follows:

x1, x2, x3, x4, x5, x6, x7
    x2, x3, x4, x5, x6, x7, x8
        x3, x4, x5, x6, x7, x8, x9
            x4, x5, x6, x7, x8, x9, x10
                x5, x6, x7, x8, x9, x10, x1
                    x6, x7, x8, x9, x10, x1, x2
                            x8, x9, x10, x1, x2, x3, x4
                                x9, x10, x1, x2, x3, x4, x5
                                    x10, x1, x2, x3, x4, x5, x6

As yet another illustration, consider a parent vector of x1 to x10 with a sub-vector size of l = 8 consecutive elements of its parent vector as follows:

x1, x2, x3, x4, x5, x6, x7, x8
    x2, x3, x4, x5, x6, x7, x8, x9
        x3, x4, x5, x6, x7, x8, x9, x10
            x4, x5, x6, x7, x8, x9, x10, x1
                x5, x6, x7, x8, x9, x10, x1, x2
                    x6, x7, x8, x9, x10, x1, x2, x3
                        x7, x8, x9, x10, x1, x2, x3, x4
                                x9, x10, x1, x2, x3, x4, x5, x6
                                    x10, x1, x2, x3, x4, x5, x6, x7

As yet another illustration, consider a parent vector of x1 to x10 with a sub-vector size of l = 9 consecutive elements of its parent vector as follows:

x1, x2, x3, x4, x5, x6, x7, x8, x9
    x2, x3, x4, x5, x6, x7, x8, x9, x10
        x3, x4, x5, x6, x7, x8, x9, x10, x1
            x4, x5, x6, x7, x8, x9, x10, x1, x2
                x5, x6, x7, x8, x9, x10, x1, x2, x3
                    x6, x7, x8, x9, x10, x1, x2, x3, x4
                        x7, x8, x9, x10, x1, x2, x3, x4, x5
                            x8, x9, x10, x1, x2, x3, x4, x5, x6
                                    x10, x1, x2, x3, x4, x5, x6, x7, x8
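
In all of the illustrations above, the starting elements appear to be exactly the positions that are not multiples of l. The sketch below states that reading of the pattern in R; the helper name start_positions is hypothetical, and the rule is inferred from the illustrations rather than given as a formal definition.

```r
# Hypothetical helper: starting positions of the sub-vectors, assuming the
# rule "skip every position that is a multiple of l" (inferred from the
# illustrations for a parent vector of length n).
start_positions <- function(n, l) {
  idx <- seq_len(n)
  idx[idx %% l != 0]
}

start_positions(10, 3)
#> [1]  1  2  4  5  7  8 10
start_positions(10, 4)
#> [1]  1  2  3  5  6  7  9 10
```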

What I Need

I need R code that produces the output described by the conditions above. You can test your code with v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) as the parent vector and any block length l with 1 < l < length(v).


Solution

  • You can do this with zoo::rollapply().

    library(zoo)
    
    fun <- \(v, l) c(v, v[1:(l-1)]) |> # append vector beginning to also get combinations of first and last elements
      rollapply(l, \(x) x) |> # apply window
      split(seq_along(v)) |> # from matrix to list
      (`[`)(seq_along(v) %% l != 0) # remove unwanted subvectors
    
    
    fun(1:10, 3)
    #> $`1`
    #> [1] 1 2 3
    #> 
    #> $`2`
    #> [1] 2 3 4
    #> 
    #> $`4`
    #> [1] 4 5 6
    #> 
    #> $`5`
    #> [1] 5 6 7
    #> 
    #> $`7`
    #> [1] 7 8 9
    #> 
    #> $`8`
    #> [1]  8  9 10
    #> 
    #> $`10`
    #> [1] 10  1  2
    
    fun(1:10, 4)
    #> $`1`
    #> [1] 1 2 3 4
    #> 
    #> $`2`
    #> [1] 2 3 4 5
    #> 
    #> $`3`
    #> [1] 3 4 5 6
    #> 
    #> $`5`
    #> [1] 5 6 7 8
    #> 
    #> $`6`
    #> [1] 6 7 8 9
    #> 
    #> $`7`
    #> [1]  7  8  9 10
    #> 
    #> $`9`
    #> [1]  9 10  1  2
    #> 
    #> $`10`
    #> [1] 10  1  2  3
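
If you would rather avoid the zoo dependency, the same idea can be sketched in base R with modular indexing. This is only a sketch under the same assumption as the filter above (starting positions that are multiples of l are dropped); the name split_overlap is made up for illustration.

```r
# Hypothetical base R variant (no zoo): build each window by wrapping
# indices around the end of v with modular arithmetic.
split_overlap <- function(v, l) {
  n <- length(v)
  stopifnot(l > 1, l < n)
  # Keep every starting position that is not a multiple of l.
  starts <- which(seq_len(n) %% l != 0)
  out <- lapply(starts, function(s) {
    # Positions s, s + 1, ..., s + l - 1, wrapped back into 1..n.
    idx <- (s - 1 + seq_len(l) - 1) %% n + 1
    v[idx]
  })
  # Name each sub-vector by its starting position, as in the output above.
  names(out) <- starts
  out
}

res <- split_overlap(1:10, 3)
res[["10"]]
#> [1] 10  1  2
```

Unlike the rollapply() version, this never builds the intermediate matrix, which may matter for long vectors; for short inputs the two should be interchangeable.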