rforcats

fct_reorder with a lookup data.frame


I have the following data.frame.

library(tidyverse)

df <- 
  tibble::tribble(
    ~year, ~sales_rep, ~sale_count,
    2021,        "1",         615,
    2021,        "2",         246,
    2021,        "3",         245,
    2022,        "1",         736,
    2022,        "2",          56,
    2022,        "3",         868,
    2023,        "1",         452,
    2023,        "2",         185,
    2023,        "3",         915
  )

# convert sales_rep into a factor
df <-
  df %>% 
  mutate(
    across(
      sales_rep,
      as_factor
    )
  )

# look at levels
levels(df$sales_rep)
#> [1] "1" "2" "3"

I would like to reorder the levels of sales_rep by percent increase in sales over average using the data.frame below. However, df_ranking has fewer rows than df, so fct_reorder() throws an error. How can I get around this?

# calculate percent change between 2023 and all previous years in sales by rep
df_ranking <-
  df %>% 
    mutate(
      year_grouping = 
        case_when(
          year == 2023 ~ 'sales_2023',
          year < 2023 ~ 'sales_previous'
        )
    ) %>% 
    summarize(
      mean_sale_count = mean(sale_count),
      .by = c(year_grouping, sales_rep)
    ) %>% 
    pivot_wider(
      names_from = year_grouping,
      values_from = mean_sale_count
    ) %>% 
    mutate(
      pct_change = (sales_2023 - sales_previous)/sales_previous
    )

# this doesn't work
df %>% 
  mutate(
    sales_rep = fct_reorder(sales_rep, df_ranking$pct_change)
  )
#> Error in `mutate()`:
#> i In argument: `sales_rep = fct_reorder(sales_rep,
#>   df_ranking$pct_change)`.
#> Caused by error in `fct_reorder()`:
#> ! length(f) == length(.x) is not TRUE
#> Backtrace:
#>      x
#>   1. +-df %>% ...
#>   2. +-dplyr::mutate(., sales_rep = fct_reorder(sales_rep, df_ranking$pct_change))
#>   3. +-dplyr:::mutate.data.frame(., sales_rep = fct_reorder(sales_rep, df_ranking$pct_change))
#>   4. | \-dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
#>   5. |   +-base::withCallingHandlers(...)
#>   6. |   \-dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
#>   7. |     \-mask$eval_all_mutate(quo)
#>   8. |       \-dplyr (local) eval()
#>   9. +-forcats::fct_reorder(sales_rep, df_ranking$pct_change)
#>  10. | \-base::stopifnot(length(f) == length(.x))
#>  11. |   \-base::stop(simpleError(msg, call = if (p <- sys.parent(1L)) sys.call(p)))
#>  12. \-dplyr (local) `<fn>`(`<smplErrr>`)
#>  13.   \-rlang::abort(message, class = error_class, parent = parent, call = error_call)

Created on 2024-01-15 with reprex v2.0.2


Solution

  • Instead of using forcats::fct_reorder() you could use factor() and provide the levels argument in the desired order.

    First you dplyr::arrange() df_ranking to make sure that sales_rep is in the desired order. Then we extract sales_rep with dplyr::pull().

    pct_change_order <-
      df_ranking |>
      arrange(-pct_change) |>
      pull(sales_rep)
    

    Now, we can apply the desired factor order:

    res <- 
      df %>%
      mutate(sales_rep = factor(sales_rep, pct_change_order))
    
    levels(res$sales_rep)
    #> [1] "3" "2" "1"