I have a simple dataset, and I'm trying to find cities with more than 3 observations (n). However, I'm encountering an error when using the fct_lump() function. Could you help me identify the issue?
tablo1 |>
  count(sehir, sort = TRUE)
sehir n
<chr> <int>
1 Adana 2
2 Adıyaman 1
3 Afyonkarahisar 2
4 Aksaray 1
5 Amasya 1
6 Ankara 23
7 Antalya 5
8 Ardahan 1
9 Artvin 1
10 Aydın 1
# ℹ 71 more rows
# ℹ Use `print(n = ...)` to see more rows
Here's the current code that results in an error:
tablo1 |>
  count(sehir) |>
  filter(fct_lump(sehir, 5, w = n))
The error message I'm receiving is:
Error in `filter()`:
ℹ In argument: `fct_lump(sehir, 5, w = n)`.
Caused by error:
! `..1` must be a logical vector, not a <factor> object.
Run `rlang::last_trace()` to see where the error occurred.
What am I doing wrong?
rlang::last_trace()
<error/rlang_error>
Error in `filter()`:
ℹ In argument: `fct_lump(sehir, 5, w = n)`.
Caused by error:
! `..1` must be a logical vector, not a <factor> object.
---
Backtrace:
▆
1. ├─dplyr::filter(count(tablo1, sehir), fct_lump(sehir, 5, w = n))
2. ├─dplyr:::filter.data.frame(count(tablo1, sehir), fct_lump(sehir, 5, w = n))
3. │ └─dplyr:::filter_rows(.data, dots, by)
4. │ └─dplyr:::filter_eval(...)
5. │ ├─base::withCallingHandlers(...)
6. │ └─mask$eval_all_filter(dots, env_filter)
7. │ └─dplyr (local) eval()
8. └─dplyr:::dplyr_internal_error(...)
Run rlang::last_trace(drop = FALSE) to see 5 hidden frames.
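The error comes from filter() itself: it needs a logical vector (one TRUE / FALSE per row), but fct_lump() returns a factor, so there is nothing for filter() to test. For fct_lump() & co you might want to start with uncounted values; with fct_lump_min(..., min = 4) you'd be left with the factor levels that have "more than 3 observations", plus Other, which you can then count: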
library(dplyr, warn.conflicts = FALSE)
library(forcats)
# uncount first to get "original" dataset
tablo1 <- read.table(header = TRUE, text="
sehir n
1 Adana 2
2 Adıyaman 1
3 Afyonkarahisar 2
4 Aksaray 1
5 Amasya 1
6 Ankara 23
7 Antalya 5
8 Ardahan 1
9 Artvin 1
10 Aydın 1") |>
tidyr::uncount(n) |>
as_tibble()
glimpse(tablo1)
#> Rows: 38
#> Columns: 1
#> $ sehir <chr> "Adana", "Adana", "Adıyaman", "Afyonkarahisar", "Afyonkarahisar"…
tablo1 |>
  mutate(sehir = fct_lump_min(sehir, min = 4)) |>
  count(sehir)
#> # A tibble: 3 × 2
#> sehir n
#> <fct> <int>
#> 1 Ankara 23
#> 2 Antalya 5
#> 3 Other 10
Created on 2024-02-01 with reprex v2.0.2
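If you'd rather stay with the already counted table, here is a minimal sketch of two alternatives. It assumes a counted tibble with columns sehir and n like the one in the question; the object names counted, big_cities and lumped are made up for illustration. The first option skips forcats entirely and filters on n; the second passes the counts to fct_lump_min() through its w argument and then sums them with count(wt = n).

```r
library(dplyr, warn.conflicts = FALSE)
library(forcats)

# Hypothetical stand-in for `tablo1 |> count(sehir)`, first 10 rows only
counted <- tibble(
  sehir = c("Adana", "Adıyaman", "Afyonkarahisar", "Aksaray", "Amasya",
            "Ankara", "Antalya", "Ardahan", "Artvin", "Aydın"),
  n = c(2L, 1L, 2L, 1L, 1L, 23L, 5L, 1L, 1L, 1L)
)

# Option 1: filter() gets a logical condition, so no lumping is needed
big_cities <- counted |>
  filter(n > 3)

# Option 2: lump on the counted data, using n as weights,
# then add up the weights per remaining level
lumped <- counted |>
  mutate(sehir = fct_lump_min(sehir, min = 4, w = n)) |>
  count(sehir, wt = n)

big_cities
lumped
```

Option 1 only keeps the cities above the threshold, while option 2 also keeps an Other row holding the combined count of the lumped cities, so which one fits depends on whether you need that remainder.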