r regex string

Using gsub to split a name based on special characters


I have the following names which I am trying to change into first name, last name format.

myNamesDup
"Cross K / Cross M"  "Davis L/Harper C"  "Williams M / Brown M G"
"Greening E-L / Roberts J"  "Jones Williams D/Browning A"

What I am trying to achieve:

"K Cross/M Cross"  "L Davis/C Harper"  "M Williams/M G Brown"
"E-L Greening/J Roberts"  "D Jones Williams/A Browning"

I have tried a few approaches without success, mainly, I think, because of the special characters.

The closest I got was with:

gsub("([A-Za-z]*) (.*)/(.*) ([A-Za-z]*)", "\\2 \\1/\\4 \\3", myNamesDup)

This gets me pretty close, but it seems to fail on double-barrelled names and on names with two first initials.


Solution

  • You could try strsplit with sub, as below:

    > unlist(lapply(strsplit(myNamesDup, "/"), \(x) paste0(sub("(\\S+) (.*)", "\\2 \\1", trimws(x)), collapse = "/")))
    [1] "K Cross/M Cross"             "L Davis/C Harper"
    [3] "M Williams/M G Brown"        "E-L Greening/J Roberts"
    [5] "Williams D Jones/A Browning"
    

    or use gsub alone, as below (note that this leaves a double space wherever the input had a space before the slash):

    > gsub("(\\S+) (.*)\\s?/\\s?(\\S+) (.*)", "\\2 \\1/\\4 \\3", myNamesDup)
    [1] "K  Cross/M Cross"            "L Davis/C Harper"
    [3] "M  Williams/M G Brown"       "E-L  Greening/J Roberts"
    [5] "Williams D Jones/A Browning"
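
Both outputs above still give "Williams D Jones" for the double-barrelled surname, because (\\S+) only captures the first word. One possible variant is sketched below. It assumes that every initial is a single capital letter, optionally hyphenated (like "E-L"), and that all initials come after the surname; `swap_name` is a hypothetical helper name, not something from the question. It may well need adjusting for names that do not follow that layout.

```r
# Input vector copied from the question.
myNamesDup <- c("Cross K / Cross M", "Davis L/Harper C", "Williams M / Brown M G",
                "Greening E-L / Roberts J", "Jones Williams D/Browning A")

# Assumption: an initial is one capital letter, optionally hyphenated ("E-L").
initial <- "[A-Z](?:-[A-Z])*"

# Group 1: the surname (lazy, may contain spaces).
# Group 2: the trailing run of one or more space-separated initials.
pattern <- paste0("^(.*?)\\s+(", initial, "(?:\\s+", initial, ")*)$")

# Hypothetical helper: trim the piece, then move the initials in front of the surname.
swap_name <- function(x) sub(pattern, "\\2 \\1", trimws(x), perl = TRUE)

# Split each entry on "/", swap every piece, and glue the pieces back together.
result <- vapply(strsplit(myNamesDup, "/", fixed = TRUE),
                 function(parts) paste(swap_name(parts), collapse = "/"),
                 character(1))
print(result)
```

Because each piece is trimmed before the swap, the doubled spaces from the gsub-only version do not appear, and anchoring the initials to the end of the string keeps "M G" together while letting "Jones Williams" stay in one piece.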