r dataframe vector factoring

Factoring categorical variable vectors in a single column of a data frame?


I'm importing a data set that has a column with the categories "PR","CG","SH","CF","SC","PI","PA". However, some rows contain multiple values (e.g. PR,CG). I was able to split those strings into lists using FFG=str_split(FFG,pattern=","), but when I try to convert the column to a factor using df<-df%>%(FFG=col_factor(levels=c("PR","CG","SH","CF","SC","PI","PA"))) I get "Error in function_list[k] : attempt to apply non-function". I'm new to R, so if I've left out any important information, just let me know. Any advice would be incredibly helpful, thank you!


Solution

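  • The error arises because the pipe is followed by a parenthesized expression rather than a function call such as mutate(); in addition, col_factor is a readr column specification meant for the col_types argument of the read_* functions, not a function that converts an existing column. One option is to use separate_rows on the original comma-separated 'FFG' column (before any str_split) so that each value gets its own row, and then convert the column to a factor with the levels specified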

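    library(dplyr)
    library(tidyr)
    df <- df %>%
        # one row per category: "PR,CG" becomes two rows, "PR" and "CG"
        separate_rows(FFG, sep=",") %>%
        # values not listed in levels become NA
        mutate(FFG = factor(FFG, levels=c("PR","CG","SH","CF","SC","PI","PA")))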
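To see what this does, here is a minimal sketch on a made-up data frame. The `id` column, the sample values, and the names `ffg_levels`, `long`, and `nested` are assumptions for illustration only and are not taken from the question. The first result has one row per category. The second is a possible alternative, not part of the original answer, that keeps one row per observation by storing a factor vector in each cell of a list column.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: only the FFG column comes from the question.
df <- data.frame(
  id  = 1:3,
  FFG = c("PR,CG", "SH", "PI,PA,CF"),
  stringsAsFactors = FALSE
)

# Levels as given in the question
ffg_levels <- c("PR", "CG", "SH", "CF", "SC", "PI", "PA")

# Long format: each comma-separated value gets its own row,
# and the other columns (here id) are repeated as needed.
long <- df %>%
  separate_rows(FFG, sep = ",") %>%
  mutate(FFG = factor(FFG, levels = ffg_levels))

# Alternative sketch: keep one row per observation and store a
# factor vector per row in a list column.
nested <- df %>%
  mutate(FFG = lapply(strsplit(FFG, ",", fixed = TRUE),
                      factor, levels = ffg_levels))

print(long)
str(nested$FFG)
```

The long format is usually easier to count, filter, and plot, while the list column preserves the original row count. If the real data has spaces after the commas (e.g. "PR, CG"), the pieces would not match the levels and would become NA; since `sep` in separate_rows is treated as a regular expression, a pattern such as `",\\s*"` should handle that case.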