r, dataframe, split, substring, tidyr

Split all columns in half by character position


My dataframe has a bunch of columns, each with six characters. Like this:

df = data.frame("A" = c("001000","101010","131313"),"B" = c("121212","434343","737373"),"C" = c("555555","959595","383838"))
       A      B      C
1 001000 121212 555555
2 101010 434343 959595
3 131313 737373 383838

I would like to split every column in half, so that each original column becomes two new columns: one containing the first three characters, the other containing the last three. Like this:

final_df = data.frame("A" = c("001","101","131"),"A1" = c("000","010","313"),
                      "B" = c("121","434","737"),"B1" = c("212","343","373"),
                      "C" = c("555","959","383"),"C1" = c("555","595","838"))
    A  A1   B  B1   C  C1
1 001 000 121 212 555 555
2 101 010 434 343 959 595
3 131 313 737 373 383 838

The new names of the columns don't matter, provided that they are in the same order.

I've used transform() or separate() for a single column, but since they require the new column names to be spelled out, I'm not sure how to apply them to all columns at once.


Solution

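  • You could use tidyr::separate_wider_position(), which splits columns at fixed widths. When several columns are selected, names_sep builds each new name from the original column name plus the name given in widths, so you don't have to name the new columns yourself: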

    df <- data.frame(
      "A" = c("001000", "101010", "131313"),
      "B" = c("121212", "434343", "737373"),
      "C" = c("555555", "959595", "383838")
    )
    
    df |>
      tidyr::separate_wider_position(
        cols = A:C, # or cols = everything(),
        widths = c(`1` = 3, `2` = 3),
        names_sep = "."
      )
    
    #> # A tibble: 3 × 6
    #>   A.1   A.2   B.1   B.2   C.1   C.2  
    #>   <chr> <chr> <chr> <chr> <chr> <chr>
    #> 1 001   000   121   212   555   555  
    #> 2 101   010   434   343   959   595  
    #> 3 131   313   737   373   383   838
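
  • If you'd rather avoid the tidyr dependency, the same split can be sketched in base R with substr(). This is only an illustration, not part of the answer above. It assumes every value is exactly six characters long, and the ".1"/".2" suffixes are an arbitrary choice made to mirror the tidyr output:

```r
df <- data.frame(
  A = c("001000", "101010", "131313"),
  B = c("121212", "434343", "737373"),
  C = c("555555", "959595", "383838"),
  stringsAsFactors = FALSE
)

# For each column, build a two-column data frame holding the first
# three and the last three characters (assumes 6-character strings).
halves <- lapply(names(df), function(nm) {
  x <- df[[nm]]
  out <- data.frame(
    substr(x, 1, 3),
    substr(x, 4, 6),
    stringsAsFactors = FALSE
  )
  # Suffixes ".1" and ".2" are an illustrative naming choice.
  names(out) <- paste0(nm, c(".1", ".2"))
  out
})

# Bind the pieces side by side, preserving the original column order.
result <- do.call(cbind, halves)
```

    Because the pieces are built in the order of names(df), the new columns come out as A.1, A.2, B.1, B.2, C.1, C.2. For strings of other lengths, the hard-coded positions 3, 4 and 6 would need to be derived from nchar().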