rdataframedplyrsplitstackshape

Expand rows in data frame by length of sequence


I have a data frame like this

   mydf <- data.frame(x = c("a", "b", "q"), 
                      y = c("c", "d", "r"), 
                      min = c(2, 5, 3), 
                      max = c(4,6,7))

        x y min max
        a c   2   4
        b d   5   6
        q r   3   7

and I would like to expand it to this using a simple function

   mydf_expanded <- data.frame(x = c(rep("a",3), rep("b",2), rep("q",5)), 
                               y = c(rep("c",3), rep("d",2), rep("r",5)), 
                               min_to_max = c(2:4,5:6,3:7))

       x y min_to_max
       a c          2
       a c          3
       a c          4
       b d          5
       b d          6
       q r          3
       q r          4
       q r          5
       q r          6
       q r          7

Which is basically trying to expand mydf by length(mydf$min:mydf:max) of each row of mydf. But I also want to add that min_to_max column which is the sequence mydf$min:mydf:max within each row.

I know about this function which can do an expansion based on a "count" you specify

       library(splitstackshape)
       mydf_expand = expandRows(mydf, "count")

But this doesn't exactly get me to my solution. Please help! Thanks!


Solution

  • df <- data.frame(x = c("a", "b", "q"), 
                       y = c("c", "d", "r"), 
                       min = c(2, 5, 3), 
                       max = c(4,6,7))
    library(tidyverse)
    
    df %>% 
      rowwise() %>% 
      transmute(x, y, all = list(seq(min, max))) %>% 
      unnest_longer(all)
    #> # A tibble: 10 x 3
    #>    x     y       all
    #>    <chr> <chr> <int>
    #>  1 a     c         2
    #>  2 a     c         3
    #>  3 a     c         4
    #>  4 b     d         5
    #>  5 b     d         6
    #>  6 q     r         3
    #>  7 q     r         4
    #>  8 q     r         5
    #>  9 q     r         6
    #> 10 q     r         7
    

    Created on 2021-03-11 by the reprex package (v1.0.0)

    data.table

    library(data.table)
    setDT(df)
    
    df$all <- apply(df[, c(3,4)], 1, function(x) seq(x[1], x[2]))
    
    df[, list(all = unlist(all)), by = list(x, y)]
    #>     x y all
    #>  1: a c   2
    #>  2: a c   3
    #>  3: a c   4
    #>  4: b d   5
    #>  5: b d   6
    #>  6: q r   3
    #>  7: q r   4
    #>  8: q r   5
    #>  9: q r   6
    #> 10: q r   7
    

    Created on 2021-03-11 by the reprex package (v1.0.0)