r tidyverse

What happens in `separate_wider_regex` with multiple matches?


I tested the following script, but it displays an error. What happened?

I just want to save Y, 33, N, and A (from Y33@N(A)) into four separate columns.

library('tidyverse')

tta <- data.frame(res=c("Y33@N(A)","H5@O(B)"))
ttb <- tta %>%
    separate_wider_regex(res, patterns=c(resn="^[A-Za-z]+",
                                         resi="\\d+(?=@)",
                                         name="(?<=\\@)[a-zA-Z]+(?=\\()",
                                         chain="(?<=\\()[a-zA-Z]+(?=\\))"
                                         ),
                         too_few="debug")

Solution

  • Motivated by the note "Unnamed components will match, but not be included in the output" in the online documentation for the `patterns` parameter, this

    ttb <- tta %>%
      separate_wider_regex(res, patterns=c(resn="^[A-Za-z]+",
                                           resi="\\d+",
                                           "\\@",
                                           name="[a-zA-Z]+",
                                           "\\(",
                                           chain="[a-zA-Z]+",
                                           "\\)$"
      ))
    

    appears to give you what you want for your test data, though I suspect your use case may require more complicated regexes.

    ttb
    # A tibble: 2 × 4
      resn  resi  name  chain
      <chr> <chr> <chr> <chr>
    1 Y     33    N     A    
    2 H     5     O     B
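
A likely reason the lookaround version fails, sketched in base R rather than tidyr: assuming `separate_wider_regex` matches the patterns one after another against the whole string, each piece has to start where the previous one ended. Lookarounds are zero-width, so `@`, `(` and `)` are never consumed, and the next piece cannot start at the right position. The two combined regexes below are my own reconstruction for illustration, not taken from tidyr itself.

```r
# Test strings taken from the question.
x <- c("Y33@N(A)", "H5@O(B)")

# Assumed equivalent of the question's patterns once joined end to end.
# The lookarounds assert that "@", "(" and ")" are present but never consume them.
lookaround <- paste0(
  "^([A-Za-z]+)",
  "(\\d+(?=@))",
  "((?<=@)[a-zA-Z]+(?=\\())",
  "((?<=\\()[a-zA-Z]+(?=\\)))$"
)

# Assumed equivalent of the answer's patterns: the delimiters are matched
# (consumed) but sit outside the capture groups, like unnamed components.
consuming <- "^([A-Za-z]+)(\\d+)@([a-zA-Z]+)\\(([a-zA-Z]+)\\)$"

# Does each combined pattern match the whole string?
lookaround_ok <- grepl(lookaround, x, perl = TRUE)
consuming_ok  <- grepl(consuming, x, perl = TRUE)

# Captured pieces for the consuming version (element 1 is the full match).
pieces <- regmatches(x, regexec(consuming, x, perl = TRUE))

print(lookaround_ok)
print(consuming_ok)
print(pieces)
```

Here `lookaround_ok` is `FALSE` for both strings and `consuming_ok` is `TRUE` for both, which is consistent with the answer above: the delimiters need to be matched by unnamed components instead of being asserted with lookarounds.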