I'm trying to use forcats::fct_relevel to specify the levels in a column, the way I've used it in ggplot, but it's giving an error about "unknown levels".
Here is a chart of the cheeses I have eaten per month:
cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2,
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)
and a list of the months:
cheesemonth<-c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
According to pages like this one, I should be able to do the following:
cheeses %>%
  mutate(mymonth = factor(mymonth)) %>%
  mutate(mymonth = fct_relevel(mymonth, cheesemonth))
and have the items in mymonth replaced by the items in cheesemonth. But instead I get:
6 unknown levels in `f`: Jan, Feb, Mar, Apr, May, and Jun
and I'm at a loss to understand why.
If I replace the last line with:
mutate(mymonth=case_match(mymonth, "1" ~ "Jan", "2" ~ "Feb", "3" ~ "Mar", "4" ~ "Apr", "5" ~ "May", "6" ~ "Jun"))
then it's fine, but this is more typing, and means I can't re-use the cheesemonth list.
So why do I get the unknown levels error?
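fct_relevel reorders existing levels; it does not rename them. After factor(mymonth) the levels are "1" to "6", so "Jan", "Feb" and the rest are unknown to it. To change the labels, which forcats calls values, use lvls_revalue: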
library(forcats)
lvls_revalue(as.character(cheeses$mymonth), cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun
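To see what fct_relevel is actually working with, it can help to inspect the levels directly. The following is a minimal sketch, assuming only that forcats is installed; the vector f is a stand-in for the mymonth column after factor() has been applied, not code from the question.

```r
library(forcats)

# Stand-in for mymonth after factor(): the levels are the month
# numbers converted to character strings.
f <- factor(c(1, 2, 3, 4, 5, 6))
levels(f)
## [1] "1" "2" "3" "4" "5" "6"

# fct_relevel() moves levels that already exist; here "6" goes first.
# The labels themselves are unchanged.
moved <- fct_relevel(f, "6")
levels(moved)
## [1] "6" "1" "2" "3" "4" "5"
```

None of "Jan" to "Jun" appears among those levels, which is why they are reported as unknown.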
Alternatively, use fct:
library(forcats)
fct(cheesemonth[cheeses$mymonth], cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun
It is even easier with base R:
factor(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan Feb Mar Apr May Jun
Or, given that months have a natural order, you may wish to create an ordered factor (also base R):
ordered(cheeses$mymonth, labels = cheesemonth)
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun
Note that R has a built-in month.abb vector (English only) so we could eliminate cheesemonth and write:
month.abb
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
ordered(cheeses$mymonth, labels = month.abb[1:6])
## [1] Jan Feb Mar Apr May Jun
## Levels: Jan < Feb < Mar < Apr < May < Jun
or to allow for months that are not present in the data:
ordered(cheeses$mymonth, levels = 1:12, labels = month.abb)
## [1] Jan Feb Mar Apr May Jun
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec
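Since the question uses a dplyr pipeline, the base R approach can also be dropped into mutate. This is a sketch under the assumption that dplyr and tibble are installed; the name relabelled is only illustrative, and passing levels = seq_along(cheesemonth) is an added choice that makes the number-to-label mapping explicit.

```r
library(dplyr)
library(tibble)

cheeses <- tribble(
  ~mymonth, ~Brie, ~Stilton,
  1, 4, 2,
  2, 4, 1,
  3, 1, 3,
  4, 1, 5,
  5, 2, 4,
  6, 3, 1
)
cheesemonth <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")

# Map month number i to cheesemonth[i] in a single factor() call,
# reusing the cheesemonth vector instead of typing out each pair.
relabelled <- cheeses %>%
  mutate(mymonth = factor(mymonth,
                          levels = seq_along(cheesemonth),
                          labels = cheesemonth))

levels(relabelled$mymonth)
## [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun"
```

Replacing factor with ordered in the mutate call should give the ordered variant in the same way.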