I have a data frame as such:
data <- data.frame(daytime = c('2005-05-03 11:45:23', '2005-05-03 11:47:45',
'2005-05-03 12:00:32', '2005-05-03 12:25:01',
'2006-05-02 10:45:15', '2006-05-02 11:15:14',
'2006-05-02 11:16:15', '2006-05-02 11:18:03'),
category = c("A", "A", "A", "B", "B", "B", "B", "A"))
print(data)
daytime category date2
1 2005-05-03 11:45:23 A 05/03/05
2 2005-05-03 11:47:45 A 05/03/05
3 2005-05-03 12:00:32 A 05/03/05
4 2005-05-03 12:25:01 B 05/03/05
5 2006-05-02 10:45:15 B 05/02/06
6 2006-05-02 11:15:14 B 05/02/06
7 2006-05-02 11:16:15 B 05/02/06
8 2006-05-02 11:18:03 A 05/02/06
I would like to turn this data frame into a time series of daily categorical frequencies like this:
day cat_A_freq cat_B_freq
1 2005-05-01 3 1
2 2005-05-02 1 3
I have tried doing:
library(anytime)
data$daytime <- anytime(data$daytime)
data$day <- factor(format(data$daytime, "%D"))
table(data$day, data$category)
A B
05/02/06 1 3
05/03/05 3 1
But as you can see the formatting a new variable, day, changes the appearance of the date. You can also see that the table does not return the days in proper order (the years are out of order) so that I can then convert to a time series, easily.
Any ideas on how to get frequencies in an easier way, or if this is the way, how to get the frequencies in correct date order and into a dataframe for easy conversion to a time series object?
A solution using tidyverse. The format of your daytime
column in your data is good, so we can use as.Date
directly without specifying other formats or using other functions.
library(tidyverse)
data2 <- data %>%
mutate(day = as.Date(daytime)) %>%
count(day, category) %>%
spread(category, n)
data2
# # A tibble: 2 x 3
# day A B
# * <date> <int> <int>
# 1 2005-05-03 3 1
# 2 2006-05-02 1 3