The title is rather confusing (apologies), but I will try to explain it here.
I have a dataset with multiple names of individuals, the number of coffees they have had on a particular day of the week, and the corresponding day of the week in 3 columns
What I want as an output, is the 'day' (column 3), when Tom, Mary, and John have the 'maximum' amount of coffee. i.e. Tom - Monday, Mary - Friday, John - Monday as a table.
x=data.frame(read.csv(file.choose()))
tom=subset(x,x$Name=="Tom")
tom.max=max(tom)
tom$Day[which.max(tom$Coffees)]
I could do this for one name, but struggling to use a loop to do it for all the names together, so I wouldn't have to repeat it for every single name in a bigger dataset. I want the output to look something like this
Any help would be appreciated.
This can be achieved easily with the dplyr
package. Please note that posting screen-shot images of your data is frowned upon - better to cut-and-paste the actual numbers (as text) or use the dput()
function.
df <- read.table(sep="", header=TRUE, text=c("
Name Coffees Day
Tom 2 Monday
Tom 1 Wednesday
Mary 3 Friday
Tom 2 Thursday
John 3 Monday
Tom 4 Monday
Mary 5 Tuesday
Tom 6 Friday
John 1 Saturday
"))
df
#> Name Coffees Day
#> 1 Tom 2 Monday
#> 2 Tom 1 Wednesday
#> 3 Mary 3 Friday
#> 4 Tom 2 Thursday
#> 5 John 3 Monday
#> 6 Tom 4 Monday
#> 7 Mary 5 Tuesday
#> 8 Tom 6 Friday
#> 9 John 1 Saturday
suppressPackageStartupMessages(library(tidyverse))
df %>%
group_by(Name) %>%
arrange(desc(Coffees), .by_group=TRUE) %>%
slice_head(n=1)
#> # A tibble: 3 × 3
#> # Groups: Name [3]
#> Name Coffees Day
#> <chr> <int> <chr>
#> 1 John 3 Monday
#> 2 Mary 5 Tuesday
#> 3 Tom 6 Friday
Created on 2024-05-15 with reprex v2.1.0