I have a dataframe, df, like so:
df
        date periodNumber value
1 2020-02-02            0   814
2 2020-02-09            0  1592
3 2020-02-09            1  1808
4 2020-02-16            0  2746
5 2020-02-16            1  2646
6 2020-02-16            2  2993
str(df)
'data.frame': 6 obs. of 3 variables:
$ date : Date, format: "2020-02-02" "2020-02-09" "2020-02-09" "2020-02-16" ...
$ periodNumber: int 0 0 1 0 1 2
$ value : num 814 1592 1808 2746 2646 2993
As can be seen, df$date is of class Date. I have a for loop to add rows to this dataframe, like so:
for(i in 1:nrow(df)){
df[nrow(df)+1,] <- c(df$date[i]+1, df$periodNumber[i], df$periodNumber[i])
}
However, this gives the error Error in as.Date.numeric(e) : 'origin' must be supplied. Why is this? The loop works if I remove the Date column. I've also tried adding as.Date() inside the for loop, but that gives the same error. For example, the code below gives the same error:
for(i in 1:nrow(df)){
df[nrow(df)+1,] <- c(as.Date(df$date[i], origin='1970-01-01')+1, df$periodNumber[i], df$periodNumber[i])
}
The c in the for loop tries to coerce the three values into a single type.
In ?c we can read:
The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression. [...] factors are treated only via their internal integer codes.
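As a small illustration of the hierarchy quoted above, this sketch combines values of different basic types with c and inspects the class of each result. The variable names are made up for the example.

```r
# c() promotes all of its elements to the highest type present
int_dbl <- c(1L, 2.5)   # integer + double    -> double
lgl_int <- c(TRUE, 2L)  # logical + integer   -> integer
num_chr <- c(1, "a")    # double  + character -> character

class(int_dbl)
class(lgl_int)
class(num_chr)
```

None of these inputs carries a class attribute, which is why the hierarchy alone decides the outcome here; a Date does carry one, and that is where things get less obvious.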
I'm actually not sure where "Date" fits into this hierarchy. One might expect it to be coerced to its underlying number, but the result stays a Date, so the other values have to be converted to Date as well. Because no origin= is supplied for them (and it is unclear where it could be supplied inside c), this throws an error.
This is actually interesting, since
c(1, as.Date("2020-01-01"))
# [1] 1 18262
but
c(as.Date("2020-01-01"), 1)
# Error in as.Date.numeric(e) : 'origin' must be supplied
This is at least an inconsistency, maybe a bug; I could not find any documentation on it.
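One possible explanation for the asymmetry, offered here as an assumption rather than something taken from the documentation: c is generic and chooses its method from the class of the first argument. With a number first, the default method runs and the Date is reduced to its underlying day count; with a Date first, the Date method runs and has to turn the number into a Date. The sketch below checks both orders. The second call is wrapped in tryCatch because whether it fails may depend on the R version.

```r
d <- as.Date("2020-01-01")

# Number first: the Date is reduced to its day count since 1970-01-01
num_first <- c(1, d)
class(num_first)
unclass(d)

# Date first: the bare number must become a Date, which may fail without an origin
date_first <- tryCatch(c(d, 1), error = function(e) conditionMessage(e))
class(date_first)  # "Date" if the call worked, "character" if the error message was caught

# Converting the number explicitly, with an origin, removes the ambiguity
explicit <- c(d, as.Date(1, origin = "1970-01-01"))
explicit
```

Passing only Date objects to c, as in the last call, sidesteps the question of how a bare number should be interpreted.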
Anyway, use data.frame instead of c to build the new row, since a data frame allows a different class for each column.
for(i in 1:nrow(df)){
df[nrow(df)+1,] <- data.frame(df$date[i]+1, df$periodNumber[i], df$periodNumber[i])
}
#          date periodNumber value
# 1  2020-02-02            0   814
# 2  2020-02-09            0  1592
# 3  2020-02-09            1  1808
# 4  2020-02-16            0  2746
# 5  2020-02-16            1  2646
# 6  2020-02-16            2  2993
# 7  2020-02-03            0     0
# 8  2020-02-10            0     0
# 9  2020-02-10            1     1
# 10 2020-02-17            0     0
# 11 2020-02-17            1     1
# 12 2020-02-17            2     2
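Since each new row depends only on one of the original rows, the loop could arguably be avoided altogether. The following is a sketch of that alternative, not part of the approach above: it builds all new rows in a single data.frame and appends them with rbind. The names new_rows and result are made up for this example, and df is rebuilt inline so that the block runs on its own.

```r
# Rebuild the example data so the block is self-contained
df <- data.frame(
  date = as.Date(c("2020-02-02", "2020-02-09", "2020-02-09",
                   "2020-02-16", "2020-02-16", "2020-02-16")),
  periodNumber = c(0L, 0L, 1L, 0L, 1L, 2L),
  value = c(814, 1592, 1808, 2746, 2646, 2993)
)

# Build every new row at once; each column keeps its own class
new_rows <- data.frame(
  date = df$date + 1,               # one day after each original date
  periodNumber = df$periodNumber,
  value = df$periodNumber           # same values as the loop above assigns
)

# Append the new rows; the column names must match those of df
result <- rbind(df, new_rows)
str(result)
```

Because no c call mixes a Date with plain numbers, the date column stays a Date throughout.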
Data:
df <- structure(list(date = structure(c(18294, 18301, 18301, 18308,
18308, 18308), class = "Date"), periodNumber = c(0L, 0L, 1L,
0L, 1L, 2L), value = c(814L, 1592L, 1808L, 2746L, 2646L, 2993L
)), row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")