I'm trying to calculate the mean of each column in my dataframe and write that mean back to every row, across multiple columns with similar names. My thought was to use mutate(across(starts_with())) to select the columns I want to manipulate, then use ~summarize(mean()) to calculate the mean of each column and overwrite the original values. However, I get an error saying that summarize() can't be used with the class of data in the `Fruits - Apples` column. When I checked that column with str(), it confirmed that the values were character, so I converted everything with as.numeric(). I still get the same error when I run my code.
# Sample Data
test<-structure(list(`Fruits - Apples` = c("1", "4"), `Fruits - Oranges` = c("2",
"6"), `Fruits - Bananas` = c("5", "3")), row.names = c(NA, -2L
), class = c("tbl_df", "tbl", "data.frame"))
> test
# A tibble: 2 × 3
  `Fruits - Apples` `Fruits - Oranges` `Fruits - Bananas`
  <chr>             <chr>              <chr>
1 1                 2                  5
2 4                 6                  3
# Attempted Code
nicetry<-test%>%
mutate(across(everything(), ~as.numeric(.x)))%>%
mutate(across(starts_with("Fruits -"), ~ summarize(mean = mean(.x, na.rm = T))))
# Error Code
Error in `mutate()`:
ℹ In argument: `across(starts_with("Fruits -"), ~summarize(mean = mean(.x, na.rm = T)))`.
Caused by error in `across()`:
! Can't compute column `Fruits - Apples`.
Caused by error in `UseMethod()`:
! no applicable method for 'summarise' applied to an object of class "c('double', 'numeric')"
Run `rlang::last_trace()` to see where the error occurred.
# Desired Output
`Fruits - Apples` `Fruits - Oranges` `Fruits - Bananas`
2.5 4 4
2.5 4 4
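Don't use summarize inside mutate. summarize() is a verb for data frames, but inside across() the lambda receives each column as a plain vector, which is why the error reports no applicable method for 'summarise' on an object of class double.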
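To see why the original call fails, here is a minimal sketch that reproduces the error outside of mutate. The vector `x` is a made-up stand-in for what across() hands to the lambda; it is not taken from the question.

```r
library(dplyr)

# Hypothetical stand-in for one column as the lambda inside across() sees it:
# a bare numeric vector, not a data frame.
x <- c(1, 4)

# summarize() dispatches on the class of its first argument. There is no
# method for a plain double vector, so this call errors; we capture the message.
err <- tryCatch(
  summarize(x, mean = mean(x, na.rm = TRUE)),
  error = function(e) conditionMessage(e)
)
print(err)

# mean() works directly on the vector, and that is all the lambda needs to return.
col_mean <- mean(x, na.rm = TRUE)
print(col_mean)
```

The captured message should match the UseMethod() error in the question, which suggests the problem is the type of the input rather than the character-to-numeric conversion.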
If you want the same number of rows as the input, use mutate:
test %>%
  mutate(across(everything(), as.numeric)) %>%
  mutate(across(starts_with("Fruits -"), ~mean(.x, na.rm = TRUE)))
# # A tibble: 2 × 3
#   `Fruits - Apples` `Fruits - Oranges` `Fruits - Bananas`
#               <dbl>              <dbl>              <dbl>
# 1               2.5                  4                  4
# 2               2.5                  4                  4
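As a possible variation, the conversion and the mean can be folded into a single across() call. This is only a sketch: it rebuilds the sample data with tibble() instead of the structure() call from the question, and the name `combined` is made up for illustration.

```r
library(dplyr)

# Sample data rebuilt from the question (character columns).
test <- tibble(
  `Fruits - Apples`  = c("1", "4"),
  `Fruits - Oranges` = c("2", "6"),
  `Fruits - Bananas` = c("5", "3")
)

# Convert and average inside one lambda. mean() returns a single value per
# column, and mutate() recycles it to every row.
combined <- test %>%
  mutate(across(starts_with("Fruits -"), ~ mean(as.numeric(.x), na.rm = TRUE)))

print(combined)
```

Keeping the two steps separate, as in the answer above, is arguably easier to read when the conversion has to apply to more columns than the averaging does.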
If you want one row per group (one row in this case, as you haven't set any groups), use summarize:
test %>%
  mutate(across(everything(), as.numeric)) %>%
  summarize(across(starts_with("Fruits -"), ~mean(.x, na.rm = TRUE)))
# # A tibble: 1 × 3
#   `Fruits - Apples` `Fruits - Oranges` `Fruits - Bananas`
#               <dbl>              <dbl>              <dbl>
# 1               2.5                  4                  4
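The "one row per group" distinction is easier to see once a grouping column exists. The sketch below assumes a hypothetical `store` column and made-up values; neither appears in the question.

```r
library(dplyr)

# Hypothetical data: `store` and all values are invented for illustration.
sales <- tibble(
  store              = c("A", "A", "B", "B"),
  `Fruits - Apples`  = c(1, 4, 2, 8),
  `Fruits - Oranges` = c(2, 6, 3, 5)
)

# mutate() keeps every input row and fills in the mean of that row's group.
per_row <- sales %>%
  group_by(store) %>%
  mutate(across(starts_with("Fruits -"), ~ mean(.x, na.rm = TRUE))) %>%
  ungroup()

# summarize() collapses each group to a single row.
per_group <- sales %>%
  group_by(store) %>%
  summarize(across(starts_with("Fruits -"), ~ mean(.x, na.rm = TRUE)))

print(per_row)
print(per_group)
```

Here `per_row` has four rows and `per_group` has two, one per store.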
Also note that if you are applying a function with no extra arguments, like as.numeric above, you don't need the ~foo(.x) form; you can just pass foo.
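A quick way to check that shorthand is to run both spellings and compare the results. This is a sketch using the sample data rebuilt with tibble(); the names `bare`, `lambda`, and `same` are made up for illustration.

```r
library(dplyr)

# Sample data rebuilt from the question (character columns).
test <- tibble(
  `Fruits - Apples`  = c("1", "4"),
  `Fruits - Oranges` = c("2", "6"),
  `Fruits - Bananas` = c("5", "3")
)

# Bare function name: across() calls as.numeric on each selected column.
bare <- test %>% mutate(across(everything(), as.numeric))

# Lambda form: the same call spelled out. It is only needed when extra
# arguments are passed, as with na.rm = TRUE in the mean() calls.
lambda <- test %>% mutate(across(everything(), ~ as.numeric(.x)))

# Compare the two results.
same <- isTRUE(all.equal(bare, lambda))
print(same)
```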