r dplyr group-summaries

Using across function in dplyr


I have a dataframe which contains missing values.

# Create a data frame: 200 rows per athlete, scores drawn from 1:20 or NA
df <- data.frame(Athlete = rep(c("Ali", "Tyson"), each = 200),
                 Score = sample(c(1:20, NA), 400, replace = TRUE))

My code groups the rows by Athlete, then counts the rows in each group whose Score is not NA.

library(dplyr)
Result <- df %>%
  dplyr::group_by(Athlete, .drop = TRUE) %>%
  dplyr::summarise_each(list(~sum(!is.na(.))))

I get the desired result, but also a warning message:

`summarise_each_()` is deprecated as of dplyr 0.7.0.
Please use `across()` instead.

I'm trying to update the code base so the warning messages stop.

Note: the warning message also says:

This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 

So if the warning does not appear, restart RStudio and rerun the script to reproduce it.
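A possible alternative to restarting: the lifecycle package documents a `lifecycle_verbosity` option which, when set to "warning", should make deprecation warnings appear on every call rather than once every 8 hours. A minimal sketch (the name `old_opts` is only illustrative):

```r
# Ask lifecycle to emit deprecation warnings on every call
# (assumption: the option is read by the lifecycle version in use)
old_opts <- options(lifecycle_verbosity = "warning")

# ... rerun the summarise_each() pipeline here to see the warning again ...

# Restore the previous setting when done
options(old_opts)
```

Saving and restoring the old value keeps the change from leaking into the rest of the session.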


Solution

  • summarise_each() was replaced by summarise_at()/summarise_all(), which were in turn superseded by across() in dplyr 1.0.0.

    library(dplyr)
    df %>%
     group_by(Athlete) %>%
     summarise(across(everything(), ~sum(!is.na(.))))
    
    #  Athlete Score
    #  <chr>   <int>
    #1 Ali       189
    #2 Tyson     195
    

    If you have only one column to summarise, as shown here, you can do it directly:

    df %>%
      group_by(Athlete, .drop = TRUE) %>%
      summarise(Score  = sum(!is.na(Score)))
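The across() form pays off once there is more than one column to summarise. The following is a minimal sketch, assuming a hypothetical second column `Score2` (not part of the original data) and a fixed seed for reproducibility; `where(is.numeric)` restricts the summary to the numeric columns.

```r
library(dplyr)

set.seed(1)  # fixed seed so the example is reproducible

# Hypothetical data: Score2 is added for illustration only
df2 <- data.frame(
  Athlete = rep(c("Ali", "Tyson"), each = 200),
  Score   = sample(c(1:20, NA), 400, replace = TRUE),
  Score2  = sample(c(1:20, NA), 400, replace = TRUE)
)

# Count the non-NA values of every numeric column, per athlete
Result2 <- df2 %>%
  group_by(Athlete) %>%
  summarise(across(where(is.numeric), ~ sum(!is.na(.x))))

Result2
```

The result should have one row per athlete and one count column for each numeric input column, with the original column names kept.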