r dataframe dplyr split

How can I split a vector of numbers into its digits when vector values differ in length?


I'm having trouble with the following in R using dplyr and tidyr's separate(). I have a dataframe like this:

x <- c(1, 2, 12, 31, 123, 2341)
df <- data.frame(x)

and I'd like to split each number into its digits and create a new variable for each digit in the dataframe, like so:

     x  a   b   c   d
1    1
2    2
3   12  1   2
4   31  3   1
5  123  1   2   3
6 2341  2   3   4   1

I have tried:

df <- df |>
  mutate(x = as.character(x)) |>
  separate(x, c("a", "b", "c", "d"), sep = 1, remove = FALSE)

but I get:

Error in names(out) <- as_utf8_character(into) :
  'names' attribute [4] must be the same length as the vector [2]

Solution

  • Using separate_longer_position() from tidyr (available since tidyr 1.3.0; the .by argument of mutate() needs dplyr 1.1.0 or later):

    library(tidyr)
    library(dplyr)
    
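    df |>
      # keep x intact and work on a copy
      mutate(value = x) |>
      # one row per digit
      separate_longer_position(value, width = 1) |>
      # label the digits a, b, c, ... within each x (assumes the values of x are unique)
      mutate(name = letters[row_number()], .by = x) |>
      # one column per digit; pivot_wider() defaults to the name and value columns
      pivot_wider()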
    
    # A tibble: 6 × 5
          x a     b     c     d
      <dbl> <chr> <chr> <chr> <chr>
    1     1 1     NA    NA    NA
    2     2 2     NA    NA    NA
    3    12 1     2     NA    NA
    4    31 3     1     NA    NA
    5   123 1     2     3     NA
    6  2341 2     3     4     1
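
  • If you would rather keep separate(): the error in the question appears to come from sep = 1, which is a single split position. Each string is cut into two pieces while four names are supplied, hence the length mismatch. A minimal sketch, assuming one split position per column boundary (sep = 1:3) is what was intended. Note that separate() is superseded in current tidyr, and that positions past the end of a shorter number should come back as empty strings rather than NA:

```r
library(dplyr)
library(tidyr)

df <- data.frame(x = c(1, 2, 12, 31, 123, 2341))

result <- df |>
  mutate(x = as.character(x)) |>
  # three split positions -> four pieces, one per name in `into`
  separate(x, c("a", "b", "c", "d"), sep = 1:3, remove = FALSE)

result
```

    Unlike the separate_longer_position() approach, the number of columns is fixed in advance here, so sep and the column names have to be extended by hand if longer numbers appear.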