I’m struggling with an issue with the complete() function that I can’t resolve. I got the desired behaviour on another data frame, but I can’t make it work the same way for this one, even though the structure of the data is similar.
Here is the data frame:
df2<- structure(list(Season = c("2023/2024", "2023/2024", "2023/2024",
"2023/2024", "2023/2024", "2023/2024", "2023/2024", "2023/2024",
"2023/2024", "2023/2024", "2023/2024", "2023/2024", "2023/2024",
"2023/2024", "2023/2024", "2023/2024", "2023/2024", "2023/2024",
"2023/2024", "2023/2024", "2023/2024", "2023/2024"), DATEMM = structure(c(2023.58333333333,
2023.58333333333, 2023.66666666667, 2023.66666666667, 2023.75,
2023.75, 2023.83333333333, 2023.83333333333, 2023.91666666667,
2023.91666666667, 2024, 2024, 2024.08333333333, 2024.08333333333,
2024.16666666667, 2024.16666666667, 2024.25, 2024.25, 2024.33333333333,
2024.33333333333, 2024.41666666667, 2024.41666666667), class = "yearmon"),
team = c("Brest", "Clermont", "Brest", "Clermont", "Brest",
"Clermont", "Brest", "Clermont", "Brest", "Clermont", "Brest",
"Clermont", "Brest", "Clermont", "Brest", "Clermont", "Brest",
"Clermont", "Brest", "Clermont", "Brest", "Clermont"), goal = c(0,
0, 5, 2, 8, 5, 11, 7, 14, 9, 25, 11, 29, 14, 34, 16, 37,
19, 49, 25, 53, 26), rank = c(7L, 9L, 7L, 17L, 12L, 17L,
12L, 17L, 12L, 17L, 4L, 18L, 3L, 18L, 4L, 18L, 6L, 18L, 3L,
18L, 3L, 18L), image_file = c("~/teamlg/Brest.png", "~/teamlg/Clermont.png",
"~/teamlg/Brest.png", "~/teamlg/Clermont.png", "~/teamlg/Brest.png",
"~/teamlg/Clermont.png", "~/teamlg/Brest.png", "~/teamlg/Clermont.png",
"~/teamlg/Brest.png", "~/teamlg/Clermont.png", "~/teamlg/Brest.png",
"~/teamlg/Clermont.png", "~/teamlg/Brest.png", "~/teamlg/Clermont.png",
"~/teamlg/Brest.png", "~/teamlg/Clermont.png", "~/teamlg/Brest.png",
"~/teamlg/Clermont.png", "~/teamlg/Brest.png", "~/teamlg/Clermont.png",
"~/teamlg/Brest.png", "~/teamlg/Clermont.png")), row.names = c(NA,
-22L), class = c("tbl_df", "tbl", "data.frame"))
I want to complete DATEMM with intermediate points. Since it is of class yearmon, I first convert it to numeric values and then complete with the sequence:
library(dplyr)  # mutate(), %>%
library(tidyr)  # complete(), nesting()
fp_interp = 3
df3 <- df2 %>%
  mutate(DATEMM_num = as.numeric(floor(12*DATEMM + .0001) / 12)) %>%
  complete(DATEMM_num = seq(min(DATEMM_num), max(DATEMM_num), (1/12)/fp_interp),
           nesting(team, image_file, Season))
I need each DATEMM_num to have the same number of rows. However, when I run table(df3$DATEMM_num), I see that the counts are not the same for each value, as they should be: some have 2 while others have 4.
Thank you for your help.
There might be a rounding issue here. Here is a way to check:
# CREATE DF2 LIKE YOU DID ALREADY
fp_interp = 3
# create df3 without completing the rows for now
df3 <- df2 %>%
  mutate(DATEMM_num = as.numeric(floor(12*DATEMM + .0001) / 12))
# this is an array with the intermediate points
# (DATEMM_num is a column of df3, so it has to be referenced as df3$DATEMM_num here)
date_with_intermediate_points <- seq(min(df3$DATEMM_num), max(df3$DATEMM_num), (1/12)/fp_interp)
# create a df that keeps only df3$DATEMM_num that are absent from the array with intermediate points
df3_check <- df3 %>% filter(!DATEMM_num %in% date_with_intermediate_points)
You will see that some DATEMM_num differ from the calculated intermediate points, although when you eyeball them, they look the same: e.g. 2023.667, 2023.917.
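To illustrate why values that print identically can still fail to match, here is a minimal base R sketch. It uses a generic 0-to-1 sequence rather than your data, and is_near() is a hypothetical helper I made up for this example, not a function from any package:

```r
# Generic illustration (not the data from the question)
pts <- seq(0, 1, by = 0.1)

print(pts[4])      # displays 0.3
pts[4] == 0.3      # FALSE: the stored doubles differ in their last bits
0.3 %in% pts       # FALSE for the same reason; %in% needs exact equality

# Hypothetical helper: membership test with a small tolerance
is_near <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}
is_near(0.3, pts)  # TRUE
```

complete() also matches on exact values, so a sequence point that differs from an existing DATEMM_num only in the last bits is treated as a new value, which would explain the uneven counts.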
Here is a solution that is a bit more roundabout, but seems to work:
fp_interp = 3
# just calculate DATEMM_num, making sure to round it to 3 digits
df3 <- df2 %>%
  mutate(DATEMM_num = round(as.numeric(floor(12*DATEMM + .0001) / 12), 3))
# create the array with intermediate points from the unrounded dates
# (starting from the rounded minimum would shift every point by the rounding error)
date_num <- as.numeric(df2$DATEMM)
date_with_intermediate_points <- seq(min(date_num), max(date_num), (1/12)/fp_interp)
# use it to create a data frame with these points and the "team" column
df_list_of_dates <- data.frame(DATEMM_num = round(date_with_intermediate_points, 3),
                               Clermont = 'Clermont',
                               Brest = 'Brest') %>%
  pivot_longer(cols = c('Clermont', 'Brest'), values_to = "team") %>%
  select(DATEMM_num, team)
# Merge the two data frames
df4 <- merge(df_list_of_dates, df3,
             by = c("DATEMM_num", "team"),
             all = TRUE)
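If you would rather stay with complete(), another option is to match on an integer step index instead of a decimal year, since whole numbers compare exactly. This is only a sketch on a small made-up data frame: toy, step and steps_per_year are names I introduced for illustration, and I have not run it on your df2. With a yearmon column you would probably need to wrap DATEMM in as.numeric() first.

```r
library(dplyr)
library(tidyr)

fp_interp <- 3
steps_per_year <- 12 * fp_interp

# Toy stand-in for df2: two teams, three consecutive months (hypothetical values)
toy <- tibble(
  team   = rep(c("Brest", "Clermont"), times = 3),
  DATEMM = rep(2023 + c(7, 8, 9) / 12, each = 2),
  goal   = c(0, 0, 5, 2, 8, 5)
)

toy_completed <- toy %>%
  # integer step index: one unit is 1/fp_interp of a month
  mutate(step = as.integer(round(DATEMM * steps_per_year))) %>%
  # fill in every step for every team
  complete(step = seq(min(step), max(step)), nesting(team)) %>%
  # convert back to a decimal year only for display or plotting
  mutate(DATEMM_num = step / steps_per_year)

table(toy_completed$step)
```

Each step should then appear once per team, and the original rows keep their goal values while the new intermediate rows get NA.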
I hope this helps!