r data-cleaning stringr

Splitting one column into two with str_split() in R


I have this data frame:

# My data frame
df <- data.frame(
  id_do_cliente = c(852, 966, 677, 877, 176, 69, 688, 525, 307, 127),
  # Named complete_name so that it matches the column used in the code below
  complete_name = c(
    "John Smith", "Emily Johnson", "Michael Brown",
    "Sarah Davis", "James Miller", "Emma Wilson",
    "Olivia Moore", "William Taylor", "Sophia Anderson",
    "Isabella Thomas"
  )
)

# Display the data frame
print(df)

I would like to create two columns, one for the first name and one for the second name, from the column complete_name.

I have to use the function str_split() with the argument simplify = FALSE.

I am doing this, but I can't access the first name and the second name:

  df %>%
    mutate(
      first_name = unlist(str_split(complete_name, " ", simplify = FALSE,2)[[1]][1]),
      second_name = unlist(str_split(complete_name, " ", simplify = FALSE,2)[[2]][2])) 

What am I doing wrong?

PS: I need to stay as close as possible to this code. I believe the problem is the way I am using the [[ operator.

Any help guys?


Solution

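  • Inside mutate(), str_split() receives the whole column and returns a list with one element per row, so [[1]][1] and [[2]][2] only ever read rows 1 and 2, and those single values are recycled down the column. You need to tell R to perform this operation for each row with rowwise(). str_split() then receives one name at a time and returns a list of length one, so the first [[ index should be 1 in both cases; only the second index changes: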

    df %>%
      rowwise() %>% 
      mutate(
        first_name = unlist(str_split(complete_name, " ", simplify = FALSE,2)[[1]][1]),
        second_name = unlist(str_split(complete_name, " ", simplify = FALSE,2)[[1]][2]))
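
To see why the first index matters, it may help to look at what str_split() returns outside of mutate(). The following is a minimal sketch using two names from the question; the variable names parts and one are only illustrative.

```r
library(stringr)

# Two names taken from the question's data.
complete_name <- c("John Smith", "Emily Johnson")

# On a whole column, str_split() returns one list element per row.
parts <- str_split(complete_name, " ", n = 2, simplify = FALSE)
length(parts)   # 2
parts[[1]][1]   # "John": first piece of row 1
parts[[2]][2]   # "Johnson": second piece of row 2, whatever the current row is

# On a single value (what rowwise() passes in), the list has length 1,
# so [[1]] is the only valid first index.
one <- str_split(complete_name[2], " ", n = 2, simplify = FALSE)
one[[1]][1]     # "Emily"
one[[1]][2]     # "Johnson"
```

Without rowwise(), the original code therefore fills first_name with "John" and second_name with "Johnson" on every row.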
    

    If you don't have to use str_split(), I would recommend looking at the separate_* functions in tidyr, such as separate_wider_delim().
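
As a rough sketch of that alternative, assuming tidyr 1.3.0 or later (where separate_wider_delim() is available) and a shortened version of the question's data frame:

```r
library(tidyr)

# Shortened version of the question's data, for illustration only.
df <- data.frame(
  id_do_cliente = c(852, 966),
  complete_name = c("John Smith", "Emily Johnson")
)

out <- separate_wider_delim(
  df,
  cols = complete_name,
  delim = " ",
  names = c("first_name", "second_name"),
  too_many = "merge",   # keep any extra words in second_name, like n = 2
  cols_remove = FALSE   # keep the original complete_name column
)

print(out)
```

This works on the whole column at once, so no rowwise() or list indexing is needed.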