r dplyr mutate across

Calculating percentages across multiple columns in dplyr


I'm attempting to mutate three numeric columns into percentages based on the sum of each column, using mutate(across(.cols = c(...))). My usual method of doing this for one column works very well:

mutate(`Percentage`= ((`Count`/sum(.$`Count`))*100))

When I apply a similar principle to a multi-column mutate call using .x or . to stand in for all called values, it instead divides each value by itself. Where am I going wrong?

# Sample Code

library(dplyr)

test<-data.frame(fruit=c("Apples","Pears","Bananas"),
                 `John`=c(1,13,34),
                 `Jacob`=c(5,9,2))%>%
  group_by(`fruit`)%>%
  mutate(`Total`=sum(`John`,`Jacob`))

# A tibble: 3 × 4
# Groups:   fruit [3]
  fruit    John Jacob Total
  <chr>   <dbl> <dbl> <dbl>
1 Apples      1     5     6
2 Pears      13     9    22
3 Bananas    34     2    36

fruiteaten<-test%>%
  mutate(across(.cols=c(`John`,
                        `Jacob`,
                        `Total`), .fns = ~ ((.x/sum(.x))*100)))

# Output

# A tibble: 3 × 4
# Groups:   fruit [3]
  fruit    John Jacob Total
  <chr>   <dbl> <dbl> <dbl>
1 Apples    100   100   100
2 Pears     100   100   100
3 Bananas   100   100   100

# Desired Output

  fruit    John Jacob Total
1 Apples   0.02  0.31  0.09
2 Pears    0.27  0.56  0.34
3 Bananas  0.70  0.12  0.56

Solution

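  • Drop the grouping before computing the percentages. `test` is still grouped by `fruit`, and each group holds a single row, so `sum(.x)` is evaluated per group and returns the value itself; every cell therefore becomes 100. Your single-column version avoids this because `.$Count` refers to the whole piped data frame rather than the current group. Ungroup first: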

    test |> 
        ungroup() |> 
        mutate(across(-fruit, \(x) x / sum(x) * 100))
    
    # A tibble: 3 × 4
      fruit    John Jacob Total
      <chr>   <dbl> <dbl> <dbl>
    1 Apples   2.08  31.2  9.38
    2 Pears   27.1   56.2 34.4 
    3 Bananas 70.8   12.5 56.2 
    

    P.S. You only need `backticks` for names that are not syntactically valid, such as names containing spaces or other special characters.
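
To see the effect of grouping side by side, here is a minimal sketch that rebuilds the question's data and runs the same percentage function on an ungrouped and a grouped copy. The helper name `pct_of_column` is made up for illustration, and `Total` is built with `John + Jacob` rather than `group_by()` plus `sum()`, which is one way to avoid creating the grouping in the first place.

```r
library(dplyr)

test <- data.frame(fruit = c("Apples", "Pears", "Bananas"),
                   John  = c(1, 13, 34),
                   Jacob = c(5, 9, 2))

# Row totals via vectorised addition, so no group_by() is needed.
test <- test |>
  mutate(Total = John + Jacob)

# Hypothetical helper: each value as a percentage of the sum of x.
pct_of_column <- function(x) x / sum(x) * 100

# Ungrouped: sum(x) covers the whole column.
ungrouped <- test |>
  mutate(across(c(John, Jacob, Total), pct_of_column))

# Grouped by fruit: every group has one row, so sum(x) equals x.
grouped <- test |>
  group_by(fruit) |>
  mutate(across(c(John, Jacob, Total), pct_of_column)) |>
  ungroup()

print(ungrouped)
print(grouped)
```

In the ungrouped result each numeric column sums to 100; in the grouped result every cell is 100, which reproduces the output in the question. Running `group_vars(test)` on a data frame is a quick way to check whether a stray grouping is still attached.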