rtidyverse

Split a string into rows by delimiter and add column with order


I know how to split a string into rows by a delimiter from

V1 V2
1 h,e
2 l,l,o

to

V1 V2
1 h
1 e
2 l
2 l
2 o

with tidyr::separate_longer_delim(V2, delim = ",") but how can I add a column with the original order of the individual string parts as in

V1 V2 order
1 h 1
1 e 2
2 l 1
2 l 2
2 o 3

?


Solution

  • Using purrr::map to get the order variable, then unnest to get the elements per row

    library(dplyr)
    library(tidyr)
    
    df %>% 
      mutate(V2 = strsplit(V2, ","), 
             order = purrr::map(lengths(V2), ~ seq_len(.x))) %>% 
      unnest(-V1)
    # A tibble: 5 × 3
         V1 V2    order
      <int> <chr> <int>
    1     1 h         1
    2     1 e         2
    3     2 l         1
    4     2 l         2
    5     2 o         3
    
    Data
    df <- structure(list(V1 = 1:2, V2 = c("h,e", "l,l,o")), class = "data.frame", 
    row.names = c(NA, -2L))