I have a data.frame which contains a factor variable at the beginning. I would like to first change the order of the factor levels, and then sort the data.frame to be ordered by those factor levels in the new order.
My problem is that the labels for my real factor levels are very long, and I would rather do the re-ordering by indices instead. I do need to re-order manually as there is no automated sort that would fit my needs.
I tried using indices with fct_reorder(); however, I get incomprehensible results. The factor is reordered, but not in the order I specified by numbers.
How can I use numbers to specify how the factor should be reordered? I would prefer a tidyverse solution.
Here is what I tried:
# Load tidyverse:
library(tidyverse)
# Create example data frame:
mydf <- data.frame(measure = c("strong", "less strong", "least strong", "fast", "slow"),
                   cases = c(5,2,11,23,15),
                   jan = c(2,1,3,4,1),
                   feb = c(1,0,1,2,3))
mydf <- mydf %>%
  # Convert to factor:
  mutate(measure = factor(measure)) %>%
  # Reorder 'measure' as follows: slow, least strong, less strong, strong, fast
  mutate(measure = fct_reorder(.f = measure, .x = c(4,2,3,5,1))) %>%
  # Arrange data.frame by reordered levels of factor 'measure':
  arrange(measure)
Converting to a factor (before manually reordering) gives me the levels in alphabetical order, which is what I used to determine the indices to pass to fct_reorder():
> levels(mydf$measure)
[1] "fast" "least strong" "less strong" "slow"
[5] "strong"
The code runs without an error, but I get this, which is not in the order I specified (less strong and least strong are in the wrong place):
> mydf
       measure cases jan feb
1         slow    15   1   3
2  less strong     2   1   0
3 least strong    11   3   1
4       strong     5   2   1
5         fast    23   4   2
I also tried starting the level numbers at 0 instead of 1, which reorders the levels again, but still not into the order I wanted. I can't see any logic in how it is reordering either.
You can use fct_relevel() to relevel by your indices:
library(dplyr)
library(forcats)
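# Start from the original mydf (before the fct_reorder() attempt),
# so that the indices refer to the alphabetical levels:
mydf %>%
  mutate(measure = factor(measure)) %>%
  mutate(measure = fct_relevel(measure, levels(measure)[c(4,2,3,5,1)])) %>%
  arrange(measure)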
       measure cases jan feb
1         slow    15   1   3
2 least strong    11   3   1
3  less strong     2   1   0
4       strong     5   2   1
5         fast    23   4   2
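
As for why the original attempt looked illogical, a likely explanation (assuming forcats behaves as documented) is that fct_reorder() treats .x as one value per row of the data, summarised per level with the median by default, rather than one value per level. The numbers c(4,2,3,5,1) were therefore matched to the rows strong, less strong, least strong, fast, slow, which gives exactly the order seen in the question. The sketch below reproduces that and shows three index-based alternatives; the object names (idx, by_row, and so on) are illustrative only.

```r
library(forcats)

# Same values as the 'measure' column in the question
measure <- factor(c("strong", "less strong", "least strong", "fast", "slow"))
levels(measure)  # alphabetical: fast, least strong, less strong, slow, strong

# fct_reorder() pairs .x with the rows (strong = 4, less strong = 2, ...),
# then sorts the levels by those values, so this reproduces the
# unexpected order from the question
by_row <- fct_reorder(measure, c(4, 2, 3, 5, 1))
levels(by_row)

# Index into the existing levels instead
idx <- c(4, 2, 3, 5, 1)

by_relevel <- fct_relevel(measure, levels(measure)[idx])   # as in the answer
by_base    <- factor(measure, levels = levels(measure)[idx])  # base R
by_lvls    <- lvls_reorder(measure, idx)  # forcats low-level helper

levels(by_relevel)
```

All three index-based variants should give the levels slow, least strong, less strong, strong, fast. Of these, lvls_reorder() is the closest to "reorder by numbers", since it takes the integer positions directly; fct_relevel() and factor() need the level labels, which the indexing into levels(measure) supplies.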