
R tidyverse: fct_relevel "unknown levels"


I'm trying to use forcats::fct_relevel to specify the levels in a column, the way I've used it in ggplot, but it's giving an error about "unknown levels".

Here is a chart of the cheeses I have eaten per month:

library(tidyverse)  # provides tribble(), mutate(), %>% and forcats

cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2,
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)

and a list of the months:

cheesemonth<-c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

According to pages like this one, I should be able to do the following:

cheeses %>% 
  mutate(mymonth=factor(mymonth)) %>% 
  mutate(mymonth=fct_relevel(mymonth, cheesemonth))

and have the items in mymonth replaced by the items in cheesemonth. But instead I get:

6 unknown levels in `f`: Jan, Feb, Mar, Apr, May, and Jun 

and I'm at a loss to understand why.

If I replace the last line with:

mutate(mymonth=case_match(mymonth, "1" ~ "Jan", "2" ~ "Feb", "3" ~ "Mar", "4" ~ "Apr", "5" ~ "May", "6" ~ "Jun"))

then it's fine, but this is more typing, and means I can't re-use the cheesemonth list.

So why do I get the unknown levels error?


Solution

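  • fct_relevel only reorders levels that already exist. After factor(mymonth) the levels are "1" to "6", so "Jan" through "Jun" are unknown to it. To change the labels, which forcats calls values, use lvls_revalue: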

    library(forcats)
    
    lvls_revalue(as.character(cheeses$mymonth), cheesemonth)
    ## [1] Jan Feb Mar Apr May Jun
    ## Levels: Jan Feb Mar Apr May Jun
    

    or use fct (available in forcats 1.0.0 and later), indexing cheesemonth by the month number:

    library(forcats)
    
    fct(cheesemonth[cheeses$mymonth], cheesemonth)
    ## [1] Jan Feb Mar Apr May Jun
    ## Levels: Jan Feb Mar Apr May Jun
    

    It is even easier with base R:

    factor(cheeses$mymonth, labels = cheesemonth)
    ## [1] Jan Feb Mar Apr May Jun
    ## Levels: Jan Feb Mar Apr May Jun
    

    or, given that months have a natural order, you may wish to create an ordered factor (also base R):

    ordered(cheeses$mymonth, labels = cheesemonth)
    ## [1] Jan Feb Mar Apr May Jun
    ## Levels: Jan < Feb < Mar < Apr < May < Jun
    

    Note that R has a built-in month.abb vector (English only) so we could eliminate cheesemonth and write:

    month.abb
    ## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
    
    ordered(cheeses$mymonth, labels = month.abb[1:6])
    ## [1] Jan Feb Mar Apr May Jun
    ## Levels: Jan < Feb < Mar < Apr < May < Jun
    

    or, to allow for months that are not present in the data, specify all twelve levels:

    ordered(cheeses$mymonth, levels = 1:12, labels = month.abb)
    ## [1] Jan Feb Mar Apr May Jun
    ## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
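
To tie this back to the pipeline in the question, here is a minimal sketch that first shows why fct_relevel complains and then relabels inside mutate. It assumes dplyr, tibble and forcats are installed; the names f and relabelled are made up for illustration and are not part of the original code.

```r
library(dplyr)
library(tibble)
library(forcats)

cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2,
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)
cheesemonth <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

# factor() on the numeric column gives the levels "1" to "6",
# so none of the month names exist as levels yet.
f <- factor(cheeses$mymonth)
levels(f)

# fct_relevel() only moves levels that already exist, e.g. "6" to the front.
levels(fct_relevel(f, "6"))

# Relabel inside the pipeline: labels are paired with levels by position.
# (relabelled is a hypothetical name for the result.)
relabelled <- cheeses %>%
  mutate(mymonth = factor(mymonth,
                          levels = seq_along(cheesemonth),
                          labels = cheesemonth))

levels(relabelled$mymonth)
```

Passing levels explicitly keeps the pairing of numbers and names from depending on which months happen to appear in the data, and cheesemonth stays reusable.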