r tidyr rvest

How do I split or separate character strings in the rows of a column?


library(rvest)
library(tidyverse)

url = 'https://en.wikipedia.org/wiki/2023_Nigerian_House_of_Representatives_election'

html_content = read_html(url)

tables = html_table(html_content, fill = TRUE)

tb_abia = tables[[9]]

# use the first row as the column names, then drop it
names(tb_abia) <- tb_abia[1,]
tb_abia <- tb_abia[-1,] 
view(tb_abia)

tb_abia$Status
[1] "Incumbent renominatedNew member electedLP gain"                                                                         
[2] "Incumbent retiredNew member electedLP gain"  

I want to split the column above (tb_abia$Status) into separate values: "Incumbent renominated", "New member elected", "LP gain", "Incumbent retired", etc.


Solution

  • The comment from Marijn is a nice solution for your problem.

    Here's another "quick and dirty" suggestion:

    tb_abia |>
      mutate(Status = gsub("(New member elected)", "_\\1_", Status)) |>
      separate(Status, into = c("S1", "S2", "S3"), sep = "_")
    

    Note that the column may contain other phrases to split on, so you may need to extend the pattern with more cases. Feel free to modify the code provided.
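
If listing every phrase becomes tedious, one possible generalisation is to put the marker wherever a lowercase letter is immediately followed by an uppercase one, since that is where the phrases in the two sample values are glued together. This is only a sketch in base R. It assumes every phrase starts with a capital letter and that no phrase contains such a junction internally. The names `status`, `marked` and `parts` are made up for illustration.

```r
# Sample values copied from the tb_abia$Status output shown in the question
status <- c(
  "Incumbent renominatedNew member electedLP gain",
  "Incumbent retiredNew member electedLP gain"
)

# Insert "_" between a lowercase letter and the uppercase letter that follows it
# (assumption: this junction only occurs where two phrases were fused together)
marked <- gsub("([a-z])([A-Z])", "\\1_\\2", status)

# Split each string on the marker; the result is a list of character vectors
parts <- strsplit(marked, "_", fixed = TRUE)

parts[[1]]
# [1] "Incumbent renominated" "New member elected"    "LP gain"
```

The same `gsub()` call could replace the one inside `mutate()` above, followed by the same `separate()` step. Rows that do not split into exactly three pieces would then need extra handling.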